Introduction


In May 1983, the Office of University Affairs of the
National Aeronautics and Space Administration (NASA) signed a
grant establishing a Remote Sensing Information Sciences Research
Group (ISRG) at the University of California, Santa Barbara
(UCSB). Research conducted under this grant has been used to
extend and expand existing remote sensing research activities at
UCSB in the areas of georeferenced information systems, machine
assisted information extraction from image data, artificial
intelligence, and vegetation analysis and modeling. This
document represents a final report of work conducted under this
grant (Grant # NASA NAGW-455) during the period May 1, 1985 to
April 30, 1986.

ISRG research continues to focus on improving the type,
quantity, and quality of information which can be derived from
remotely sensed data. As we move into the coming year of our
research, we will continue to focus on the needs of the remote
sensing research and application community which will be served
by the Earth Observing System (EOS) and Space Station, including
associated polar and co-orbiting platforms. As discussed in the
following material, we have begun to integrate, extend, and expand
existing remote sensing research activities at UCSB in the areas
of Global Science, Georeferenced Information Systems, Machine
Assisted Information Extraction from Image Data, and Artificial
Intelligence.

As world population continues to increase, there is an ever
expanding need for systems and techniques capable of acquiring,
integrating, and analyzing information concerning the extent, use
of, and changes in the major components of the earth's surface.
NASA is playing an important role in the development of systems
such as EOS which have significant data acquisition
capabilities. To achieve the full potential of such systems,
however, requires that farsighted fundamental research be directed
towards the scientific application of information systems
technologies. These technologies can improve the base upon which
assessments may be made of both the current and changing status
of the components of the biosphere, hydrosphere, lithosphere, and
atmosphere.

This report documents accomplishments at UCSB in what we
consider to be a five to ten year effort to prepare to take full
advantage of the capabilities of the platforms and systems
associated with Space Station (e.g., EOS). Through this work, we
have targeted fundamental research aimed at improving our basic
understanding of the role of information systems technologies and
artificial intelligence techniques in the integration,
manipulation and analysis of remotely sensed data for global
scale studies. This coordinated research program is possible at
UCSB due to a unique combination of researchers with experience
in all these areas.

During the early years of this effort, the focus was on the
integration of these existing research activities at UCSB and the
initiation and conduct of a number of research activities with a
variety of NASA centers. We have also worked on background
assessments of research and technology, as well as beginning
steps towards implementation of a Pilot Land Data System (PLDS)
for NASA Headquarters.

We continue to be involved in PLDS development efforts, both
through the Science Steering Group and a small research contract
with NASA Ames Research Center. In addition, UCSB personnel have
been, and are, involved with: the EOS Data Systems Panel; Space
Station Data User Working Group; Global Resources Information
Systems; The World Bank; the United Nations Environment Program's
Global Resources Information Database program; and the Committee
on Data Management and Computation (CODMAC) of the National
Academy of Sciences.

Furthermore, during this year we have been told that funds
from NASA Code EI will be provided to supplement ISRG
activities. These funds were proposed in September, 1985, to
cover a range of tasks. Funding for this effort has just been
received. Work accomplished in connection with this effort will
be reported in our upcoming progress report in December, 1986.

The material which follows details ongoing work directly
aided by this grant during the past year. Several of the projects
have used this funding as a catalyst to aid other NASA offices in
research on the integration of remotely sensed and other
data into an information sciences framework. The following
sections discuss the details of the projects dealing with:

* Pilot Land Data System; 

* Performance Analysis of Image Processing Algorithms for 
Classification of Natural Vegetation in the Mountains
of Southern California; 

* KBGIS-II: A Knowledge-Based Geographic Information
System; 


* The Need for Improved Information Systems; 

* Support for Global Science: Remote Sensing's Challenge.

These projects are discussed in some detail in the following
section. Additional information on many of these projects can be
found in our January 1, 1986 Progress Report and Proposal. In
this report, we include copies of several new journal articles,
funded in part by this grant. The appendices which follow contain
committee memberships held by our staff with relevance to
information sciences, and recent presentations and symposia.
Abstract

Remote sensing today uses a wide variety of techniques and methods. Resulting data are analyzed by man and
machine, using both analog and digital technology. The newest and most important initiatives in the U.S. civilian
space program currently revolve around the Space Station complex, which includes the core station as well as
co-orbiting and polar satellite platforms. This proposed suite of platforms and support systems offers a unique potential
for facilitating long term, multi-disciplinary scientific investigations on a truly global scale.

Unlike previous generations of satellites, designed for relatively limited constituencies (e.g., Landsat for the land
scientist and Seasat for the oceanographic community), Space Station offers the potential to provide an integrated 
source of information which recognizes the scientific interest in investigating the dynamic coupling between the 
oceans, land surface, and atmosphere. 

Earth scientists already face problems that are truly global in extent. Problems such as the global carbon balance
and regional deforestation and desertification require new approaches, which combine multi-disciplinary,
multi-national teams of researchers, employing advanced technologies to produce a type, quantity, and quality of data not
previously available.

The challenge before the international scientific community is to continue to develop both the infrastructure and
expertise to, on the one hand, develop the science and technology of remote sensing, while on the other hand,
develop an integrated understanding of our global life support system, and work toward a quantitative science of the
biosphere.


Introduction 

The newest and most important initiatives in the U.S. 
civilian space program currently revolve around the 
Space Station complex. The Space Station complex 
includes a space station, and its associated co-orbiting 
and polar satellite platforms. This proposed suite of plat- 
forms and support systems offers a unique potential for 
facilitating long term, multi-disciplinary scientific investiga- 
tions on a truly global scale. 

Basically, the man-tended systems which are proposed
for the various platforms have the capability of providing
a wide range of data from both operational and research
sensors. The large volumes of multispectral, multitemporal
data from these systems, supported by efficient and
effective data systems, provide the potential for data
continuity which has, to a large degree, been lacking from
sensor systems operating on independent free flying
platforms. The challenge to the remote sensing community is,
in essence, two-fold. The first challenge is to get ready to
handle the large volumes of data which will become
available in the 1990 time frame. The second challenge
to the remote sensing community is to bring the science
and technology we are developing to a broader
constituency, in the service of what we call global science, or as
discussed by Botkin et al. (1984), “The Science of the
Biosphere”. The biosphere is the large scale planetary
system that includes and sustains life.

From the perspective of scientists studying the earth's
surface, the most important component of the Space
Station complex is the Earth Observing System (EOS)
(NASA, 1984a; NASA, 1984b). EOS, based on the
current design concept, has both active and passive earth
surface imaging sensor systems as well as atmospheric
sounding systems (Table 1). EOS is an evolutionary step
in our capabilities for remote sensing of the earth, and 
TABLE 1

EOS SURFACE IMAGING AND SOUNDING (Taken from NASA 1984a)

SISP - Surface Imaging and Sounding Package

1. Moderate Resolution Imaging Spectrometer (MODIS)
   Measurement: Surface and cloud imaging, visible and infrared, 0.4 µm - 2.2 µm, 3-5 µm, 8-14 µm; resolution varying from 10 nm to 0.5 µm
   Spatial resolution: 1 km x 1 km pixels (4 km x 4 km open ocean)
   Coverage: global, every 2 days during daytime plus IR nighttime

2. High Resolution Imaging Spectrometer
   Measurement: Surface imaging, 0.4-2.2 µm, 10-20 nm spectral resolution
   Spatial resolution: 30 m x 30 m pixels
   Coverage: pointable to specific targets, 50 km swath width

3. High Resolution Multifrequency Microwave Radiometer (HMMR)
   Measurement: 1-94 GHz passive microwave images in several bands
   Spatial resolution: 1 km at 36.5 GHz
   Coverage: global, every 2 days

4. Lidar Atmospheric Sounder and Altimeter (LASA)
   Measurement: Visible and near infrared laser backscattering to measure atmospheric water vapor, surface topography, atmospheric scattering properties
   Spatial resolution: vertical resolution of 1 km; surface topography to 3 m vertical resolution every 3 km over land
   Coverage: global, daily atmospheric sounding; continental topography total in 5 years

SAM - Sensing with Active Microwaves

5. Synthetic Aperture Radar (SAR)
   Measurement: L, C, and X-Band radar images of land, ocean, and ice surfaces at multiple incidence angles
   Spatial resolution: 30 m x 30 m pixels
   Coverage: 200 km swath width; daily coverage in regions of shifting sea ice

6. Radar Altimeter
   Measurement: Surface topography of oceans and ice, significant wave height
   Spatial resolution: 10 cm in elevation over oceans
   Coverage: global with precisely repeating ground tracks every 10 days

7. Scatterometer
   Measurement: Sea surface wind stress to 1 m/s, 10° in direction; Ku band radar
   Spatial resolution: one sample at least every 50 km
   Coverage: global, every 2 days


may provide the earth, ocean, and atmospheric science
communities with data to support integrated investigations
among disciplines and scientists from many nations
on an unprecedented scale. Unlike the previous generation
of satellites, designed for relatively limited constituencies
(e.g., Landsat for the land scientist and Seasat for the
oceanographic community), EOS has the potential to
provide an integrated source of information which recognizes
the scientific interest in investigating the dynamic
coupling between the oceans, land surface, and atmosphere.

In the same way that EOS represents an evolution in
earthward-looking satellite technology, we believe the
scientific objectives which EOS may help to accomplish
can produce an evolutionary improvement in our
understanding of our planet. Traditional branches of the earth
sciences have been limited in scope to modest areas, and
to relatively narrow ranges of biophysical, geochemical
and socioeconomic processes, by the extant
technology to measure, map, monitor, and model those
processes. It is our hope, and indeed appears to be the
hope of the United States (U.S.) National Aeronautics and
Space Administration (NASA), that EOS will foster and
expand collaboration between scientific disciplines,
continuing recent trends within the remote sensing
community toward interdisciplinary science on an
international scale.

Historical Perspective 

The history of science shows a general trend towards 
specialization: individuals developing greater expertise in 
increasingly narrow fields. A portion of this specialization 
has been enhanced by technological developments. The 
microscope expanded our horizons inward; early optical 
microscopes evolved into today's computer-controlled
electron microscopes and microprobes. The telescope 
expanded our horizons outward; technology has brought 
us to a time of electronically controlled active mirror tele- 
scopes and radio telescopes to probe the distant reaches 
of the universe. Early timepieces permitted navigation
over the high seas and a time of rapid developments in
the science of cartography. Today’s geographers and map 
makers use the tools of high technology, including both 
advanced digital computers and satellites, both for finding 
and then locating and plotting objects on the earth’s 
surface. 

Over the last decade, however, society has become 
more aware of problems which are fundamentally inter- 
disciplinary: the greenhouse effect, regional deforestation, 
and groundwater pollution are only a few examples. An 
understanding of the greenhouse effect requires not only 
knowledge of the effect of the atmosphere’s composition 
on radiative heat balance, but also atmospheric circula- 
tion, land/atmosphere interactions, ocean/atmosphere 
interactions, as well as biogeochemical cycles on the land, 
in the air, and in the ocean. The EOS program as 
presently constituted represents both a means to provide 
the data needed for such complex, large-area problems 
and an attempt to develop the infrastructure needed to 
address these problems. 

The history of remote sensing mirrors those trends
which have occurred in science and technology at large
(Figure 1). The tethered balloons of the 1850’s and 1860’s
were the first remote sensing platforms. Balloons evolved
to the aircraft of the early 1900’s, and then to the first
satellite platforms which became available in the 1960’s.
The Space Station currently being planned for the 1990’s
includes a permanent manned presence in space. This
station complex, with its manned core and co-orbiting and
polar platforms, represents a major step in our observational
potential. The earliest sensors were the human eye,
and the earliest recording devices tablets and scribes;
panchromatic films developed in the 1830’s led to the
color films of the 1920’s, and these evolved into the
electro-optical, real and synthetic aperture sensors of the
1950’s and 1960’s. Until the 1960’s, data produced by
remote sensor systems were analyzed using analog


Figure 1. Simplified diagram of trends which have
occurred in remote sensing over the past 150 years.

TRENDS IN REMOTE SENSING

LOCAL INVENTORY --> GLOBAL SURVEYS

SINGLE SOURCE DATA --> MULTIPLE SOURCE DATA

SIMPLE IDENTIFICATION --> COMPLEX PROBLEM SOLVING

MANUAL TECHNIQUES --> MACHINE ASSISTED TECHNIQUES

PHOTOGRAPHS --> IMAGES

ANALOG RECORDING --> DIGITAL RECORDING

CAMERAS --> ELECTRONIC SENSOR SYSTEMS

BALLOONS --> SATELLITES

COMPLEXITY -->


techniques. In the 1960’s and continuing through to the 
present, the digital computer has become an increasingly 
important analytic tool. 

Today’s remote sensing practice uses virtually every
technique developed in the past 100 years. Balloons,
aircraft, and satellites all carry sensors ranging from
cameras to electronic scanners and sounders, and synthetic
aperture radars using virtually all of the electromagnetic
spectrum. Resulting data are analyzed by
man and machine, using both analog and digital
techniques sharing portions of the tasks. In a modern
remote sensing laboratory, the light table and stereo
viewer are found next to the computer terminal - and the
modern student of remote sensing science recognizes the
potential of each.

The field of statistics developed in the 17th and 18th 
centuries provided science with a vital tool for under- 
standing natural processes. In the 1920’s and 1930’s, the 
development of sampling theory furthered applications of 
statistics. These developments, along with computer 
technology in the 1950’s and 1960’s, provided the remote
sensing community with necessary tools for hypothesis
testing and the design of field work to both verify and 
provide confidence limits on the products of our analyses. 
Further, statistics provides the theoretical background to 
move from simple identification of single source data to 
complex problem solving using multiple data sources. 
The distinction between data and information is elusive, 
and we realize that one scientist’s data may be another's 
information. Within the context of the science of the 
biosphere, vigorous application of sampling theory and 
statistical accuracy verification are required for at least two 
reasons. First, we are beginning to unambiguously 
demonstrate that existing maps are woefully inadequate 
to the task of providing baseline information for 
monitoring and modeling those dynamic processes that 
help to sustain life on the Earth (Botkin et al. 1984; 
Mann, 1985). Second, the multidisciplinary work we 
anticipate in the future must be rigorously based on 
ground truth and accuracy verification. 
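
As a minimal sketch of the kind of confidence limits referred to above, the following Python example estimates map accuracy from a ground-truth sample and attaches a normal-approximation interval. The function name, the sample counts, and the choice of the normal approximation are illustrative assumptions, not the procedure used in the work reported here.

```python
import math


def accuracy_confidence_interval(n_correct, n_samples, z=1.96):
    """Return (accuracy, lower, upper) for a sampled map accuracy.

    Uses the normal approximation to the binomial distribution.
    z = 1.96 corresponds to an approximate 95 percent interval.
    All names and defaults here are hypothetical, for illustration only.
    """
    if n_samples <= 0 or not 0 <= n_correct <= n_samples:
        raise ValueError("need n_samples > 0 and 0 <= n_correct <= n_samples")
    p = n_correct / n_samples
    half_width = z * math.sqrt(p * (1.0 - p) / n_samples)
    # Clamp the limits to the valid range for a proportion.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical field check: 170 of 200 sampled sites agree with the map.
accuracy, lower, upper = accuracy_confidence_interval(170, 200)
print(f"accuracy {accuracy:.3f}, interval [{lower:.3f}, {upper:.3f}]")
```

The normal approximation is adequate only for reasonably large samples and accuracies not too close to 0 or 1; other interval estimates may be preferable for small field samples.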

Applications of multisource data are most important in 
modern remote sensing, and we often use the phrase
“information system" to describe our concept (Estes, 
1984). An information system encompasses the entire 
flow of data, from sensor systems, through calibration and 
processing, through dissemination of derived information, 
to some end user and a decision process (see Figure 2). 
An important element of a new direction in remote 
sensing research is found in the recommendations of the 
EOS Science and Mission Requirements Working Group: 
“The Earth Observing System should be established as an 
information system..." (NASA, 1984a). The statement 
recognizes that if EOS is viewed simply as a sensor platform,
without considering the processing and distribution
of resulting data and information to a user community, the
potential of EOS will never be realized.
Figure 2. The variety of data types and levels of data sources from which users may acquire data for a remote sensing
analysis task.






Current Trends 

Naisbitt in his book Megatrends (1984) discusses the 
new directions which he believes are transforming 
modern man and the planet on which we all live. Several
of these megatrends are directly relevant to the challenges 
to be met by the remote sensing community as we move 
to take full advantage of evolving remote sensing and 
related information science technology. Megatrends 
discussed by Naisbitt include: the move from an industrial 
society towards an information society; from forced
technology to high technology with high touch (i.e.,
counter-balancing human response); short term to long
term; centralized to decentralized; hierarchies to network- 
ing; either/or to multiple options; and, finally, with 
apologies to Mr. Naisbitt for not using “national economy 
to world economy”, we are moving from addressing local 
and regional science issues to topics of global concern. 

The first megatrend discussed by Naisbitt (1984) is 
what he calls our global move from an industrial to an 
information society. In Naisbitt’s own words, “None (of 
these megatrends) is more subtle, yet more explosive than 
the megashift from an industrial to an information soci- 
ety". This information society, says Naisbitt, had its begin- 
ning in 1956 and 1957. It is interesting to note here that 
this is the time frame for the launch of Sputnik and about 
the time we began to move from using the term aerial 
photographic interpretation to the term remote sensing. 

Remote sensing is an information generating technol- 
ogy. One only has to examine the Applications volume of 
the recent Manual of Remote Sensing (Estes and Thorley,
1983) to find eleven chapters and over eleven hundred 
pages, written by over one hundred and fifty authors, to 
see the tremendous variety of information being gener- 
ated from this technology. However, many of us deeply 
involved in this field feel frustrated. We feel that if we 
could only find our data more efficiently, manage it better, 
and use it in a better fashion we could do so much more. 
Better information systems are needed which link scientists
at institutions not only within the U.S. but around the
globe.

In remote sensing we are also moving, albeit in this 
area most slowly, from forced technology to high tech 
with high touch. To see that remote sensing is high tech 
we need only to look again to the second edition of the 
Manual of Remote Sensing (Simonett and Ulaby, 
1983). Yet in the development of this technology users 
have not always been well served. Often we, as scientists,
have been presented with systems by the engineering
community and asked “What can you do with this?”
While this has changed somewhat in recent years, science
and applications data users must be brought into the
mission planning process at the earliest possible moment.
There is still a nagging suspicion on the part of many in
the remote sensing community that our voices are not
always heard.


It is obvious that we, as scientists interested in our own
data needs, may ask for too much. However, we hope
that NASA, ESA, and other agencies involved in the
forefront of remote sensing will listen to a community
which recognizes the information potential of remote
sensing, yet is leery of the impacts of commercialization
on our long term science access to satellite data - a
community fearful that Space Station and its associated
systems, even including EOS, will further erode what is
currently a bare minimum and patently inadequate
funding for basic and applied remote sensing oriented
research. We have the high tech, yes, but what is needed,
as Naisbitt says, is more high touch, a counter-balancing
human response that recognizes the needs and concerns
of the scientists and applications users of remotely sensed data.
Our goal is to do the best science possible (Estep, 1968),
to employ the fruits of our marvelous technology to
provide an adequate standard of living for mankind.

In a more subtle way within this high tech/high touch 
trend, we also see an increase in the use of techniques 
from artificial intelligence as a trend towards high touch. 
Particularly, work in the area of expert systems and 
natural languages is showing potential for making 
complex processing of remotely sensed data easier and 
more understandable for science and application users 
alike. These techniques, if properly applied, show potential
for allowing the less-trained individual to take full
advantage of the range of services offered by a system
such as EOS. Research and development in this whole 
area is, and should be, directed at letting scientists and 
users act more like scientists and users than librarians, 
communications specialists, computer scientists, and so 
on. 

Analogous to Naisbitt’s short term/long term are the
trends we have seen in the shifts from applied to basic
research within NASA since the launch of Landsat 1.
Prior to 1972, many researchers in the U.S. and around
the world in the field of remote sensing were doing
fundamental work on the digital processing of aircraft
multispectral scanner data. Overnight, Landsat 1, which
was not a research but an operational satellite, provided
a large volume of data in digital format. Instead of building a
solid research foundation, we in the U.S. moved directly
towards applications with a new sensor which had an
inadequate information system and basic research
foundation to support a large number of applications.

In recent years (1979-1980), we have seen a shift
within NASA back to a more basic research emphasis,
looking at the use of remote sensing concerning problems
requiring long range research. The recent Global Biology/
Global Habitability and the EOS science and mission
requirements documents produced by NASA make this
trend clear (NASA 1983a, NASA 1983b, NASA 1984a,
NASA 1984b). This trend may not be as clear in NASA's
actions in the information sciences. The current data pilots
funded by NASA Code EI are aimed at employing existing
technologies to improve access to, processing of, and
interaction with, remote sensing data and scientists using
that data. We believe that this is proper in this case. There
is a very large and compelling need here to do this. Yet,
NASA must not lose sight of the need for basic research
in the information sciences as well. If we are to gain the
maximum benefit from new EOS sensor systems (such as
the multifrequency, multiple look angle Synthetic Aperture
Radar and the High Resolution Imaging Spectrometer),
let alone combine data from these space-based
sensors with other ancillary data types, a great deal of
fundamental thought and work is needed.

The next two trends are centralized to decentralized, 
and hierarchies to distributed systems. These trends also 
illustrate a change from single-investigation research to 
multi-disciplinary, multi-institutional research as expressed 
in the EOS Science and Mission Requirements docu- 
ments (NASA 1984a, NASA 1984b). In the past only a 
few countries and research centers (principally federal
laboratories and a few universities) had the computing 
capability to acquire and deal effectively with satellite 
data. We take hierarchies in Naisbitt’s sense to be individual
organizations geared toward working independently,
in contrast to networking, which attempts to facilitate
the interaction of these organizations. What we have
in remote sensing today are hierarchies, where central 
facilities distribute data and processing knowledge to the 
community. Today countries and institutions in all parts of 
the world have acquisition and processing capabilities. 
This presents a new protocol, associated with the idea of
networks as opposed to hierarchies. What is required are 
more efficient and effective networks for the exchange of 
data on a global scale. Data/Information Systems which 
facilitate communication among scientists around the 
world are working to improve our understanding of bio- 
spheric processes. 

The megashift from “either/or” to “multiple option” can
be related to the use of geographic information systems,
which facilitate the multiple options we have in remote
sensing today. Early on in machine assisted processing of
remotely sensed data, there was a push to obtain all
information on a given problem from a single multispectral
satellite image alone. When researchers began to realize
that the information in the spatial and spectral domains
represented in a single image was insufficient for many
tasks, we began to explore the multi-temporal aspects of
the data. Once we exhausted this possibility, we began to 
explore the potential of incorporating digital terrain data. 
Later we digitized soils, geologic and landuse maps. Crop 
phenologies were plotted as trajectories and processed. 
Prior probabilities and logic were employed to assess the 
nature and magnitude of change in a given area. 

Many researchers now employ a wide variety of 
spatially-referenced data in remote sensing research. The 
synergism between geographic information system
technology and remote sensing truly enhances the potential
of each. For remote sensing data to be most useful
they must typically be combined with other data types. In 
contrast, the quality of geographic information systems 
depends on the currency of the data they contain. 
Remote sensing can update GIS data planes while GIS 
can provide for the efficient use of the ancillary data 
required by remote sensing (Estes, 1984). 

Finally, in the use of remote sensing, we are moving 
toward addressing issues which are truly global in nature. 
That is, we now have the potential to collect consistent 
global-scale data sets from which information may be 
derived and whose accuracy is verifiable. Past estimates 
of important global parameters (such as vegetation types, 
primary productivity, and biomass) have been difficult to 
develop and virtually impossible to verify. EOS can be 
one of the keys to unlocking global science. Yet, to 
continue this metaphor, it will be information systems 
which will allow us to turn this key in the lock. Improved 
information systems will facilitate our ability to conduct 
global research in an effective manner. 

Analytic Forms and Objectives 

This is particularly important as we look to the types of
analyses that will be conducted using the EOS information
system. Examples of these analyses will generally
take one of four explanatory forms and be oriented
toward at least three objectives which will be discussed in
some detail here. Explanatory forms include: (1)
morphometric analysis, (2) cause-and-effect analysis, (3)
temporal analysis, and (4) functional and ecological
systems analysis (Estes, Jensen and Simonett, 1980).
Objectives include: (1) inventory, (2) mapping, (3)
monitoring, and (4) modeling (Estes, 1985).

Morphometric Analysis 

Scientific studies typically require measurement to 
determine the morphology of phenomena, i.e., their form 
and structure. Measured properties of phenomena may 
be generally classified as physical, spatial (geographical), 
or temporal properties. It is important to obtain quantita- 
tive information concerning these parameters in addition 
to descriptive evaluation. 

Scientific investigations may require data ranging from
simple in situ observations where the spatial properties
are not important, to complex analyses where the properties
of phenomena are most significant when viewed in
relation to their spatial association with other phenomena.
Field investigations are typically costly and site specific,
providing only point observations that must be interpolated
to yield a geographical surface. Remote sensing,
however, can provide both point (per picture element) and
areal physical property information. Remote sensing can
play an important role in providing information on a
number of biophysical properties, such as geometry (size,
shape, arrangement, etc.), color or visual appearance,
temperature, dielectric nature, moisture content, and
organic and inorganic composition (Jensen, 1983).
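
The interpolation of point field observations to a geographical surface, mentioned above, can be sketched with inverse distance weighting. This is only one of many interpolation methods and is chosen here as an assumption for illustration; the function names, the grid layout, and the sample values are hypothetical.

```python
import math


def idw_interpolate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) point samples.

    Each sample is weighted by 1 / distance**power, so nearby field
    observations dominate the estimate.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The location coincides with a field site: use its value.
            return float(value)
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def interpolate_grid(samples, n_rows, n_cols, cell_size=1.0):
    """Build a surface as a list of rows, one estimate per grid cell."""
    return [
        [idw_interpolate(samples, col * cell_size, row * cell_size)
         for col in range(n_cols)]
        for row in range(n_rows)
    ]


# Hypothetical field sites: (x, y, measured value).
field_sites = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
surface = interpolate_grid(field_sites, n_rows=1, n_cols=3)
```

The contrast with remote sensing is that a sensor records a value for every picture element directly, so no such interpolation step is needed to obtain areal coverage.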

A fundamental characteristic of remote sensing when 
applied to morphometric analysis is that a given scale of 
observation may provide specific types of categorical 
information by itself, and it can be used as a method of 
stratifying an area for subsequent analysis. 

Cause-and-Effect Analysis 

Man has always examined the processes acting on his 
surroundings and attempted rational explanations of the 
causes. The synoptic view has important implications for 
regional studies which attempt to identify cause-and-effect 
relationships. The establishment of cause-and-effect 
relationships is important to researchers in all branches of 
science. Increasing our ability to perceive effects which 
may be beyond direct visual experience can provide 
insights which may lead to improved understanding of 
environmental phenomena and processes. 

EOS and remote sensing in general offer scientists the
capability to extend our understanding of effects which
were until now beyond the limits of our perception and
effective measurement. This may include recording a
given wavelength of energy outside the visible spectrum
and/or assuming a viewing perspective for a sufficient
period of time (e.g., a geostationary satellite) to adequately
monitor phenomena. For example, thermal infrared scanners
can record temperature differences in a river to
pinpoint the location and provide a spatial perspective on
a thermal plume undetectable by the unaided eye (Estes
et al., 1983). Similarly, the reflective near infrared has
been employed to detect biophysical stress (i.e., effect)
before the cause (e.g., loss of moisture from pathogens) is
detectable in the visible spectrum (Jensen, 1983).

Temporal Modes of Explanation 

While in many scientific studies spatial variations are 
of prime concern, we must also consider the temporal 
domain. EOS sensor systems for surface imaging and 
sounding show a variety of temporal resolutions consistent 
with science needs (see Table 1). A concern with time 
in science stems from two principal considerations: 

(a) Explanations of observed phenomena typically 
involve an analysis of processes and sequences which 
occur through time. 

(b) The rates of change for a given phenomenon 
constitute an important characteristic. 

Change in many scientific studies is synonymous with 
process and sequence. To be able to identify and monitor 
change accurately and consistently within a spatial 
framework is important. The ability to view objects and/ 
or phenomena in their spatial context through time in a 
consistent manner is an important contribution of remote 
sensing to global science. Inconsistent data plague 
temporal studies. EOS data will be an internally consistent, 
longitudinal (i.e., temporal) data set. 

The acquisition of a single datum or multi-temporal 
data depends upon the application. If the study is 
primarily concerned with relatively static phenomena 
(e.g., soils, slopes, rock types), single or widely spaced 
observations may be sufficient. If, on the other hand, 
dynamic phenomena (e.g., runoff, flooding, crop growth, 
moisture response) are involved, the temporal resolution 
of EOS provides data to meet a variety of science requirements. 
For an example see Table 2. In addition, by 
interrogating an interaction matrix between static and 
dynamic phenomena developed from remote sensing 
supplied data, much detailed information concerning the 
functioning of both static and dynamic elements present 
in a given landscape can be achieved (Estes, Jensen and 
Simonett, 1980). 

Functional and Ecological Systems Analysis 

Data must be transformed into useable information in 
order to understand a process or to make a decision. 

While researchers often require spatially accurate data for 
both micro- and macroscale phenomena, efficient or 
accurate methods commonly do not exist for collecting 
these data. Remote sensing systems offer the means to 
acquire such data, and are beginning to be applied to 
systems analysis at both ends of the spatial continuum. 
Researchers at the University of California, Santa Barbara 
(UCSB), have been working with NASA personnel to 
understand the relationship between reflectance from 
major species in the North American Boreal Forest as well 
as leaf area index and biomass. The research involves the 
gathering of detailed field data and correlating the information 
derived with data acquired using helicopters, 
aircraft, and satellites. 

In addition to these studies, UCSB and NASA researchers 
have been examining the potential of using 
advanced very high resolution radiometer (AVHRR) and 
Landsat imagery to map within known accuracy limits the 
areal extent and spatial distributions of major forest types 
in the North American Boreal Forest (see Figure 3). The 
combination of these research projects is directed at 
improving our scientific understanding of the cycling of 
carbon and other elemental materials (Atjay et al., 1979; 
and NASA, 1983a). In addition, scientists with remote 
sensing backgrounds are examining the information 
gained by the application of models to a number of physical 
processes and cultural phenomena (e.g., crop inventories, 
monitoring snowmelt runoff, developing models 
for monitoring urban expansion, and energy consumption). 
EOS will greatly facilitate these types of studies. 

The use of remotely sensed data as input to numerical 
models is complex to implement, but attractive in several 
ways. First, remote sensing data are inherently distributed 
(i.e., spatially disaggregated). As such they are incompatible 
with many conventional models of environmental processes 
wherein values for a given area are “lumped” in some 
fashion or assigned to a specific node. Typically, these 
models do not readily accommodate remote sensing inputs. 

Second, distributed models (both because of their greater 
spatial specificity and because they often are more of the 
deterministic than of the nodal or index type) may offer the 
potential of greater forecasting power under extreme 
conditions. Finally, the combination of remote sensing and 
modeling within a geographic information system framework 
(where inputs are organized employing geographic 
coordinates) has special appeal because it appears that each 
needs the other to realize their maximum contribution. Thus 
remote sensing may play an integral part in functional and 
ecological systems analyses wherein it may act as a key to 
the interfacing of biophysical, geochemical, social, and 
economic data for effective modeling purposes. 

TABLE 2 SAMPLE SCIENCE OBSERVATIONAL NEEDS (Taken from NASA 1984a) 

                                                    ACCURACY                                   SPATIAL    OBSERVATION  SPECTRAL
PARAMETER               APPLICATION                 DESIRED    REQUIRED   APPROACH             RES.       FREQUENCY    RES.

Soil Features
o Moisture              Hydrologic & geochemical    5 moisture 5 moisture
                        cycles                      levels     levels
o Surface                                           5%         10%        Microwave Radiometer 1-10 km    2 day        20 cm + 1 cm
o Root Zone                                         5%         10%        Model                30-1000 m  1 week       20 cm + 1 cm
o Types-Areal Extent    Geochemical cycles;         10%        10%        Visible/SAR          30 m       annual       20 nm/50 nm
  (peat, wetlands)      Agricultural & Forestry
o Texture-Color         Agriculture & Forestry      10%        10%        Visible/SAR          30 m       annual       20 nm/50 nm
o Erosion               Geochemical cycles          10%        10%        Visible/SAR          30 m       annual       20 nm/50 nm
o Elemental storage     Geochemical cycles
o Carbon                                            10%        10%        Visible/SAR          30 m       monthly      20 nm/50 nm
o Nitrogen                                          10%        10%        Visible/SAR          30 m       monthly      20 nm/50 nm

Vegetation
o Identification        Hydrologic cycle, biomass   1%         5%         Visible, Near IR,    1 km       7 day        10-20 nm
                        distributions & change,                           Thermal IR
o Areal Extent          primary production, plant   1%         10%        Visible, Near IR,    30 m       30 day       30 nm
                        productivity, respiration,                        Thermal IR
o Condition (stress,    nutrient cycling, trace gas 10%        15%        Visible, Near IR,    30 m       3 day        10-20 nm
  morphology,           source sinks,                                     Thermal IR, SAR
  phytomass)            vegetation-climate
                        interaction, microclimate
o Leaf area index,                                  10%        20%        Visible, Near IR,    30 m       3 days       50 nm
  canopy structure                                                        Thermal IR, SAR
  and density

Figure 3. The variation in areal extent of the North American Boreal Forest derived from “reliable” conventional sources. 
Minimum extent common to all sources is in hatched pattern. Maximum extent from sources used is represented by dot pattern. 


While the modes of explanation discussed above are 
examples of the scientific analyses which will be 
conducted employing the EOS system, the objective of these 
studies will be to achieve an improved knowledge of 
those biochemical, geophysical, and socioeconomic 
processes that affect life on this planet. EOS can provide 
significant help in this area. EOS will improve our ability 
to inventory and map critical resources, facilitate 
monitoring of critical resources and processes occurring 
over both large and small areas of the globe, and improve 
the accuracy of our models of the complex processes 
which impact life on this planet. 

Inventory Mapping 

Most users involved in geographic analyses want to 
see a map of information relevant to their application. 
Basemaps today are largely derived using photogrammetric 
techniques. It is in the area of thematic mapping 
(e.g., land cover, hydrology, soils, etc.) that considerable 
research is occurring on the use of remotely sensed data. 
Thematic mapping is an important component of any 
land resources investigation (Simonett, 1976). The 
Federal Mapping Task Force identifies Mapping, Charting, 
and Geodesy (MC&G) tasks as being: 

* Land Surveys (point positioning for geodesy, cadaster, 
engineering); 

* Land Mapping (planimetric, topographic, thematic); 

* Marine Mapping (nautical chart, bathymetry, floating 
aid, hazard) (Donelson, 1973). 

The above list could serve as general cartographic 
requirements for most countries of the world. Currently, 
such tasks are carried out within the United States’ 
national mapping programs primarily by the Defense 
Mapping Agency, the U.S. Geological Survey, and the 
National Oceanic and Atmospheric Administration. The 
U.S. federal MC&G task force, however, cannot meet the 
current requirements for maps, charts, and geodetic infor- 
mation. For example, the U.S. Geological Survey can 
satisfy only about 16 percent of the first priority needs for 
new mapping and 39 percent for revision of outdated 
maps (Donelson, 1973). 

The ability to produce thematic maps from remotely 
sensed data is directly related to our ability to extract data 
on the classes of thematic data of interest to a given user 
employing either manual or machine-assisted processing 
techniques. It is important to note that most maps 
produced for operational applications of a geographic 
nature are derived from visual image analysis techniques. 
Researchers in many disciplines are working to improve 
machine-assisted classification accuracies (Rosenfeld, et 
al, 1981; Rosenfeld, 1982; and Estes et al, 1983). This 
task, however, is formidable and there has been a general 
overselling of remote sensing's ability to provide accurate 
thematic data in a rapid fashion. This overselling has 
made it difficult at times to obtain funds required to gain 
an in-depth understanding of the steps needed to 
improve existing thematic mapping capabilities. 

Monitoring 

The ability to detect changes in land cover patterns or 
biophysical characteristics is central to our ability to use 
remotely sensed data for planning and management 
purposes (Anderson, 1977). Monitoring of agricultural 
crops during a growing season can lead to the prediction 
of regional production. Rates of change of environmental 
parameters are highly variable by category and location. 
As an example, the encroachment of urban land use onto 
prime agricultural land at the rural-urban fringe occurs at 
a much faster rate than that of the regeneration of clearcut 
land to forest. Thus variation in rates of change must be 
carefully assessed from both functional and spatial 
perspectives in order to provide appropriately stratified 
units amenable to the systematic extraction of change 
information. 

Interest has increased in recent years in the potential of 
remote sensing for monitoring environmental 
phenomena. Recent NASA programmatic interest in 
Global Biology and Global Habitability and the National 
Academy’s proposed International Geosphere Biosphere 
Program (IGBP) are largely predicated on the ability of 
remote sensing to monitor selected environmental conditions 
on a global scale (NASA, 1983a; NASA, 1983b; and 
Waldrop, 1984). These programs propose to collect infor- 
mation which has significant geographic applications. 
From research on desertification and deforestation to esti- 
mates of global elemental cycling and factors affecting 
climate, these programs call for monitoring and modeling 
research on an unprecedented scale. It is encouraging to 
note that these programs recognize the need for long- 
term research. Yet, from a reading of these and other 
similar documents it appears that there is a feeling that, at 
least within research funding agencies within the U.S., the 
image analysis techniques and processing, storage, and 
retrieval systems required to support these efforts are in 
place and only need to be applied. This is unfortunately 
not the case. 

Research using Landsat data for the detection and 
mapping of changes in land cover has demonstrated 
some potential, but much more needs to be done. To 
date, change detection studies employing machine- 
assisted processing techniques have demonstrated a 
potential for detecting and identifying areas of certain 
types of environmental change (Christensen and 
Lachowski, 1977; Friedman, 1978; Place, 1979; and 
Computer Systems Corporation, 1979). They have not, 
however, demonstrated the capability to detect changes 
consistently and with field verified absolute accuracies in 
the 80- to 90-percent range in a variety of geographic 
environments (Estes, Stow and Jensen, 1982; Estes, 
1985). 

Modeling 

An important aspect of remote sensing has been to 
develop models which can be driven by inputs derived 
from remotely sensed data. Models which employ 
machine-assisted processing of remotely sensed data to 
address specific geographic applications are still largely in 
the development stage. Considerable research emphasis 
must take place if we are to extend our understanding 
from the realm of systems structure into the area of 
systems processes and dynamics. 

The ability to predict consequences of trends in 
environmental conditions and to assess the potential 
impacts of management decisions through simulations is 
an important step towards understanding the state and 
dynamics of a variety of geographic phenomena. 

Remote sensing techniques have been applied to 
provide inputs to land capability and suitability models. 
Most operational usage, however, is limited to manual 
interpretation of aerial photographs. In many instances, 
acquiring and processing aerial survey data and their 
subsequent interpretation create the current bottleneck in 
the timely and effective operation of both land capability 
and suitability models. Land use updates typically cost 
50-75 percent of the original survey costs, which severely 
restricts their number (Anderson, 1977). Many researchers 
consider the potential for semi-automated digital updates 
of land use surveys as the major, unfulfilled promise and 
potential advantage of satellite remote sensing. 

All land resources have inherent temporal and spatial 
components. It is necessary to predict both the quantity of 
aggregate change which is likely to occur in the future 
(i.e. the amount of land area likely to leave or enter a 
particular land cover category) and the most probable 
geographic location of change. The existing literature on 
the application of remote sensing to land cover spatial 
predictive modeling is very limited (Estes, Jensen and 
Simonett, 1980). So too is the literature on all modeling 
using remote sensing which documents the potential of 
remote sensing inputs to models on a quantitative basis 
(Lulla, 1981; Barker, 1983; Lulla, 1983). Research in this 
area must occur if the application of remotely sensed data 
to research on the biosphere is to achieve its true poten- 
tial. 
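
As an illustration of the aggregate quantities referred to above, the following minimal sketch tallies, from two classified maps of the same area, how much area leaves and enters each land cover category. It is not drawn from the cited literature: the function name change_summary, the category labels, and the assumption that one grid cell represents one unit of area are all hypothetical. It describes observed change only and makes no prediction.

```python
from collections import Counter


def change_summary(before, after):
    """Tally category transitions between two classified maps.

    `before` and `after` are equal-sized 2-D lists of category labels
    (hypothetical inputs; one cell is assumed to be one unit of area).
    Returns (transitions, leaving, entering) as Counters.
    """
    transitions = Counter()
    for row_b, row_a in zip(before, after):
        for b, a in zip(row_b, row_a):
            transitions[(b, a)] += 1

    leaving, entering = Counter(), Counter()
    for (b, a), n in transitions.items():
        if b != a:  # only cells whose category changed
            leaving[b] += n
            entering[a] += n
    return transitions, leaving, entering


# Hypothetical 2 x 3 maps for two dates.
before = [["ag", "ag", "forest"],
          ["ag", "urban", "forest"]]
after = [["urban", "ag", "forest"],
         ["urban", "urban", "clearcut"]]

transitions, leaving, entering = change_summary(before, after)
print(dict(leaving), dict(entering))
```

The transitions table also records where change occurred by category pair, which is the starting point for any model of the most probable location of future change.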


Conclusions 

In conclusion, both earth science and technology 
development have progressed to a point where the 
conduct of global science appears feasible. Indeed, the 
earth sciences community is already faced with problems 
that are truly global in extent. Such problems require new 
approaches, which combine multi-disciplinary, multi-national 
teams focused on these problems, employing 
advanced technologies which can generate a type, quantity, 
and quality of data not previously available to the 
scientific community. EOS and the EOS program have this 
potential. Yet if we are to fully employ the potential of 
EOS it must be done within an information systems 
context, linking scientists together with both required 
facilities and each other. Such an approach can improve 
the global science community’s access both to data 
sources and processing capabilities. The science of the 
biosphere is a data-intensive activity and in its broadest 

ABSTRACT 

In this paper we describe the architecture and working of a recently implemented 
knowledge-based GIS (KBGIS-II) that was designed to satisfy several general 
criteria for GIS. The system has four major functions that include query-answering, 
learning and editing. The main query finds constrained locations for spatial objects 
that are describable in a predicate-calculus based spatial object language. The main 
search procedures include a family of constraint-satisfaction procedures that use a 
spatial object knowledge base to search efficiently for complex spatial objects in 
large, multilayered spatial data bases. These data bases are represented in quadtree 
form. The search strategy is designed to reduce the computational cost of search in 
the average case. The learning capabilities of the system include the addition of new 
locations of complex spatial objects to the knowledge base as queries are answered, 
and the ability to learn inductively definitions of new spatial objects from examples. 
The new definitions are added to the knowledge base by the system. The system is 
currently performing all its designated tasks successfully, although it is currently 
implemented on inadequate hardware. Future reports will detail the performance 
characteristics of the system, and various new extensions are planned in order to 
enhance the power of KBGIS-II. 

1. INTRODUCTION 

In its simplest form, a geographical information system (GIS) may be viewed as a database 
system in which most of the data is spatially indexed, and upon which a set of procedures operate 
in order to answer queries about spatial entities represented in the database. On the basis of previous 
research concerning the design and implementation of GIS, one may infer several requirements 
that a GIS should satisfy, as well as several principles of design and implementation that 
permit the satisfaction of such requirements. In this essay, we examine both the requirements and 
the associated principles, first in general terms and then in terms of a knowledge-based GIS 
(KBGIS-II) that has been recently implemented. 

1.1. Requirements of GIS 

Previous research (see, for example, Marble[14], Caulkins[3] and Peuquet[17]) suggests that 
the following general requirements should be satisfied in the design and implementation of most 
GIS: 

a) an ability to handle large, multilayered, heterogeneous databases of spatially indexed data 

b) an ability to query such databases about the existence, location and properties of a wide 
range of spatial objects 

c) an efficiency in handling such queries that permits the system to be interactive 

d) a flexibility in configuring the system that is sufficient to permit the system to be easily 
tailored to accommodate a variety of specific applications and users. 

The preceding requirements imply that any GIS satisfying them to a significant degree will be a 
large and complex software system, designed to run on a hardware system with both extensive 
memory and fast processing capabilities. Hence the design, construction and testing of the 
software will be a large and complex task requiring the systematic application of techniques 
developed in computer science. 

1.2. Principles for satisfying the requirements 

There are several general principles that may be applied in order to facilitate the design and 
implementation of a GIS satisfying the four requirements listed above. A first principle, relating 
to all four of the requirements, involves the systematic application of techniques and approaches 
developed in a variety of subfields of computer science (CS). To date, few GIS have been con- 
structed on the basis of such systematic knowledge. Five subfields of CS appearing to have par- 
ticular relevance for GIS include: 

a) Software engineering, which provides a set of techniques to aid in the design, implementation 
and testing of large software systems. Only recently have GIS researchers (e.g., Aronson[1], 
Caulkins[3], and Marble[14]) described the applicability of software engineering techniques 
to the construction of GIS. 

b) Database theory, which provides a selection of data models (see Peuquet[17]), data structures 
and database management techniques that may be used in satisfying the first three 
requirements listed above. 

c) The study of algorithms and complexity is applicable to GIS in its provision of a theoretical 
basis for algorithms that will search large spatial databases for complex spatial objects in an 
efficient manner. In particular, the emerging subfield of computational geometry (see 
Preparata and Shamos[18]) promises much in the way of efficient spatial algorithms. 


d) Artificial intelligence studies computational techniques for solving problems which are either 
computationally intractable or for which there are no well-understood algorithms. The com- 
plexity of spatial objects and the size of the spatial databases suggests the applicability of 
AI techniques in designing data-structures and procedures for answering queries. 

e) Computer graphics and natural language processing are subfields of CS that provide tech- 
niques for constructing efficient and appropriate interfaces to GIS. 

A second principle, relating to the first three requirements listed above, involves the integra- 
tion of approaches and procedures developed in a variety of disciplines that are related to GIS. 
These disciplines include computer vision, image understanding and digital cartography (see, for 
example, Ballard and Brown[2]). Two reasons for this integration are: 

a) these disciplines all study the same basic problem of recognizing and reasoning about spatial 
objects implicitly encoded in spatially indexed data sets. Since their evolution has been 
somewhat independent, GIS research would benefit from the integration of approaches and 
procedures developed in these other disciplines. 

b) There has been a recent and growing realization that it is often a practical necessity to 
merge image data sets, such as LANDSAT scenes, with the more traditional datasets of GIS, 
such as digitized maps and vectorized representations of map features (see Jackson[11]). 
Computer vision and image understanding have developed techniques that will allow the 
integration of such capabilities into GIS. 

A third principle, relating to the third requirement, involves the application of procedures 
that reduce the search effort involved in answering queries, particularly by avoiding simple, 
exhaustive search strategies. As we note below, responding to queries about complex spatial 
objects in a large database is an inherently difficult computational task. One approach to reduc- 
ing search effort involves the application of various knowledge-based search techniques developed 
in AI research that employ the empirical and theoretical knowledge developed in several substan- 
tive fields of study, such as forestry, geography, geology and geophysics. 


A final principle, relating to the fourth requirement, is to construct GIS in such a way that 
they may be easily tailored to specific applications and/or users by the users themselves. In particular, 
one may provide editors that allow users to augment and modify the system’s data and 
knowledge structures. One may also provide “learning” procedures that automatically augment 
the system’s data and knowledge structures as queries are processed. 

1.3. Structure of the Essay 

In the main body of this essay, we discuss these requirements and principles in terms of a 
knowledge-based GIS (KBGIS-II) which has just been implemented. We first provide an overview 
of the system, including the main system functions and the system architecture. We then describe 
the language in which we represent spatial objects. In the sections following, we provide descrip- 
tions of the main components of the system, including the user interface, the spatial object 
knowledge base, the system editors, the high-level search procedures, the constraint-satisfaction 
search procedure, the low-level search procedures and the learning procedures. We conclude with 
a summary of the system, and its relationship to the four requirements and the associated princi- 
ples. 

2. OVERVIEW OF KBGIS 

In this section, we provide an overview of the main functions and architecture of KBGIS-II, 
together with a summary of the manner in which the four requirements discussed above are met 
in the system. 

2.1. System Functions 

KBGIS-II is able to perform four main functions over which the user has control: 

a) In query mode, the system answers queries concerning spatial objects that are represented, 
usually in implicit form, in the spatial database. At present, there are two main forms of 
query, which may be viewed as functional inverses. The first query takes the general form: 


(FIND locations <# of cases> <spatial object> <spatial window>) (1) 

and is satisfied when the system finds sets of spatial locations at which the spatial object 
description is satisfied for the required number of examples in a specified spatial window. 
The spatial object is specified in terms of the spatial object language (SOL) defined below. 
The inverse query takes the general form: 

(FIND objects <spatial window> <object class>) (2) 

which is satisfied when the system finds all spatial objects that belong to a given class of 
spatial objects and that exist in a specified spatial window. 

A very large class of spatial data-base queries may be expressed in terms of queries (1) and 
(2), which include queries relating to decision-making tasks in which one seeks sets of locations 
that satisfy various constraints or optimality conditions. The first query, for example, may be used 
to find solutions to the travelling salesman problem. Furthermore, it is easy to satisfy an even 
broader set of queries, such as requests for statistical summaries of the spatial objects in given 
areas, by further processing the outputs resulting from queries (1) and (2). 

b) In learn mode, the system modifies and augments its knowledge base. In one form of learn- 
ing, which occurs by default in query mode, the system augments its knowledge base with 
the locations of a selected subset of newly discovered spatial objects. In a second form of 
learning, that currently must be invoked by the user, the system learns inductively how to 
define new spatial objects. The definitions of these new objects, and related information, are 
then added to the system’s knowledge base. 

c) In edit mode the user is able to modify and augment the SOL and associated procedures as 
well as modifying the system’s knowledge base. 

d) In trace mode the user is able to follow the processing steps being executed by the system. 
Trace mode may be invoked in query, learn or edit modes. 
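
To make the two query forms in a) concrete, the sketch below holds each query as a small Python record and renders it in the parenthesised form shown in (1) and (2). This is only an illustration under stated assumptions: the class names, the four-number window layout, and the example object class FOREST-STAND are hypothetical, and the actual argument syntax accepted by KBGIS-II is not given in this section.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed window layout: (row_min, col_min, row_max, col_max).
Window = Tuple[int, int, int, int]


def _window_text(window: Window) -> str:
    """Render a window as a parenthesised list of its four bounds."""
    return "(" + " ".join(str(v) for v in window) + ")"


@dataclass(frozen=True)
class FindLocations:
    """Query (1): find locations that satisfy an object description."""
    n_cases: int
    spatial_object: str  # SOL description, kept as opaque text here
    window: Window

    def to_text(self) -> str:
        return (f"(FIND locations {self.n_cases} {self.spatial_object} "
                f"{_window_text(self.window)})")


@dataclass(frozen=True)
class FindObjects:
    """Query (2): find all objects of a class inside a window."""
    window: Window
    object_class: str

    def to_text(self) -> str:
        return f"(FIND objects {_window_text(self.window)} {self.object_class})"


q1 = FindLocations(3, "(EQUAL ((LAND LOC1) AGRICULTURE))", (0, 0, 63, 63))
q2 = FindObjects((0, 0, 63, 63), "FOREST-STAND")
print(q1.to_text())
print(q2.to_text())
```

Holding the two forms as separate record types mirrors their description as functional inverses: one fixes the object and asks for locations, the other fixes the window and asks for objects.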


2.2. Architecture of the System 


The basic architecture of the system is illustrated in Figure 1. The user interface is a gen- 
eral module that controls the I/O behaviour of the system, including the parsing of user queries. 
Each of the four sets of procedures corresponds to one of the four main functions of the system. 
The function knowledge base contains knowledge about the functions that define the SOL, and is 
modifiable by the user. The spatial object knowledge base contains knowledge about spatial 
objects (such as their definitions and various heuristics), while the location tree data base contains 
the basic spatial data layers. 
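
The abstract notes that the spatial data bases are represented in quadtree form. The following generic region quadtree is offered only as a minimal sketch of that idea for a single layer, not as the KBGIS-II data structure: the functions build and count, the nested-tuple node layout, and the sample grid are assumptions. Uniform blocks collapse to a single leaf, so a count over the layer never visits the pixels inside them.

```python
def build(grid, r=0, c=0, size=None):
    """Build a region quadtree over a square grid whose side is a power of 2.

    A uniform block becomes a leaf (its value); otherwise the node is a
    4-tuple of children in the order NW, NE, SW, SE.
    Leaf values are assumed not to be tuples themselves.
    """
    if size is None:
        size = len(grid)
    first = grid[r][c]
    if all(grid[i][j] == first
           for i in range(r, r + size)
           for j in range(c, c + size)):
        return first
    h = size // 2
    return (build(grid, r, c, h), build(grid, r, c + h, h),
            build(grid, r + h, c, h), build(grid, r + h, c + h, h))


def count(node, value, size):
    """Count pixels equal to `value`; uniform blocks are handled in one step."""
    if not isinstance(node, tuple):
        return size * size if node == value else 0
    h = size // 2
    return sum(count(child, value, h) for child in node)


# Hypothetical 4 x 4 landuse layer: A = agriculture, F = forest.
grid = [["A", "A", "F", "F"],
        ["A", "A", "F", "F"],
        ["A", "A", "A", "F"],
        ["A", "A", "F", "F"]]

tree = build(grid)
print(tree)
print(count(tree, "A", len(grid)))
```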

2.3. KBGIS-II and GIS Requirements 

The requirement that the system handle very large, multilayered databases must be met 
partly in terms of the software system and partly in terms of the hardware on which the software 
runs. The requirement that the system be able to respond to queries about complex spatial objects 
is met in terms of the SOL, the search procedures adopted and the knowledge and data base 
structures employed. The requirement concerning search efficiency is also met in terms of the 
search procedures and data structures chosen, while the requirement of system flexibility is 
satisfied in terms of the editors available to the user. The software design entailed by the 
requirements described above has of necessity made the current hardware (VAX 11-750) 
sub-optimal for the task, and the size of the databases that can be handled at interactive 
speeds is thus limited. 

3. THE SPATIAL OBJECT LANGUAGE 

Before describing the components of the system represented in Figure 1, we provide a 
description of the spatial object language (SOL) that is used to represent objects in KBGIS-II. 
The choice of SOL is important for several reasons, including: 


a) The SOL defines the class of spatial objects about which the system may learn and about 
which the system may be queried. 

b) The choice of SOL has practical implications for the ease with which various computational 
tasks, such as search, may be carried out. 

c) The SOL is of value in revealing the computational complexity of the problem of finding 
spatial objects in the system's data bases. 

In this section, we describe the SOL in terms of its ability to represent spatial objects. 

An important feature of the SOL described below is the flexibility that it offers the user. As 
in similar predicate calculus-based languages (see, for example, Charniak and McDermott[5]) the 
syntax is relatively simple and inference mechanisms are well-known. The user, however, has the 
option of defining a large number of predicates, functions, variables and constants in order to 
provide the language with an expressive power that is appropriate for a given spatial domain. 

3.1. The SOL defined 

A spatial object is defined as a set of spatial locations together with a set of properties 
characterizing those locations. In its most basic form, we define a location to be a set composed 
of some collection of the smallest spatial units, or “pixels”, that partition the area represented in 
the database. A location is not necessarily a connected set of pixels. One may then extend this 
definition of a location to include sets of locations. 
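
The definition of a location can be sketched as follows. Here a location is modeled as a frozen set of (row, column) pixel indices; that representation, the helper is_connected, and the choice of 4-neighbour adjacency are assumptions made for illustration only. The sketch shows that a disconnected set of pixels is still a valid location and that a set of locations is itself representable.

```python
def is_connected(location):
    """Return True if the pixels form one 4-connected group (flood fill)."""
    pixels = set(location)
    if not pixels:
        return True
    start = next(iter(pixels))
    seen = {start}
    stack = [start]
    while stack:
        r, c = stack.pop()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in pixels and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == pixels


# A location is a set of pixel indices (hypothetical examples).
field = frozenset({(0, 0), (0, 1), (1, 1)})
scattered = frozenset({(0, 0), (5, 5)})  # valid, though not connected

# Extending the definition: a location made of a set of locations.
compound = frozenset({field, scattered})

print(is_connected(field), is_connected(scattered), len(compound))
```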

We employ three classes of properties in defining the SOL: 

a) Pixel properties, or PPROPs, are properties that characterize individual pixels in the data- 
base. Each layer in the spatial database has at least one associated PPROP. Examples of 
PPROPs are Landuse, Geology and Elevation. It is evident that the type of landuse, lithol- 
ogy or elevation are all properties that may be used to characterize either a single pixel or 
each pixel in a collection of pixels. 


b) Pixel-group properties, or GPROPs, are properties that characterize the collection of pixels 
comprising some location, but do not characterize each single pixel in the collection. Exam- 
ples of GPROPs are Size, Shape and Orientation. 

c) Relational properties, or RPROPs, are properties that describe the relationship between two 
locations or between the properties of two locations. Examples of RPROPs include Dis- 
tance, Direction and Containment. 

In the SOL, a spatial object is described as a conjunction of members of the three classes of 
properties that are applicable in characterizing a given set of spatial locations. We represent 
these properties in terms of predicates that may be interpreted in terms of relationships between 
one spatial location and a set of property values, between two spatial locations and a set of pro- 
perty values or between the property values of two spatial locations. PPROP and GPROP pro- 
perties may be represented in the form: 

EQUAL ((U-FUNCTION LOC1) VAL) 

while RPROP properties may be represented either in the form: 

EQUAL ((B-FUNCTION LOC1 LOC2) VAL) 

or in the form 

EQUAL ((B-FUNCTION <function of LOC1> <function of LOC2>) VAL) 

In these definitions, LOCi is a constant or variable representing a location; VAL is a constant or 
variable representing the value of some property; U-FUNCTION is a unary function of one loca- 
tion; B-FUNCTION is a binary function of two locations; and EQUAL is a predicate that indi- 
cates the truth or falsity of the statement. 

We now provide examples of the three classes of predicates: 
a) To describe a location whose landuse is agriculture, we use the PPROP predicate 

EQUAL ((LAND LOC1) AGRICULTURE) 

This predicate is satisfied when the variable LOC1 is bound to a location (i.e., a set of spatial 
indices) for which it is true that the value of the landuse property is AGRICULTURE for 
each spatial index in the location. It is possible to verify the truth value of a PPROP predicate 
based on information stored in the appropriate layer of the spatial database. 

b) To describe a location whose area is between 50 and 60 resolution units we use the GPROP 
predicate 

EQUAL ((AREA LOC1) (50 60)) 

This predicate is satisfied if the variable LOC1 is bound to a location having an area of 
between 50 and 60 pixels. The truth value of a GPROP predicate may be verified using 
computed or stored information. The system has a function for each GPROP, that com- 
putes the value of the corresponding property. 

c) To describe an object consisting of two locations that are separated by a distance of 10 to 
20 resolution units we use the RPROP predicate 

EQUAL ((DISTANCE LOC1 LOC2) (10 20)) 

which is true when the locations bound to LOC1 and LOC2 are separated by 10 to 20 units. 
The system has a function that computes the value of the property corresponding to each 
RPROP. 

The language also permits relational comparisons to be made between the properties of two 
groups of spatial indices using the arithmetic comparison operations EQ, GT, LT, GE, LE 
corresponding to =, >, <, >= and <= respectively. To specify, for example, that the area of 
one component of an object is greater than the area of another component, we may write: 

EQUAL ((GT (AREA LOC1) (AREA LOC2)) TRUE) 

Any of the predicates described above may be combined using the logical connectives ∧ 
(AND) and ∨ (OR). Logical negation (¬) may be combined with any PPROP or GPROP predicate 
by using the NOT-EQUAL predicate in place of the EQUAL predicate in the above expressions. 
As a simple example, we may choose to model a city as a commercial core (LOC1) 
surrounded by a residential annulus (LOC2), in terms of the SOL representation 

EQUAL ((LAND LOC1) COMMERCIAL) 

∧ EQUAL ((AREA LOC1) (30 40)) 

∧ EQUAL ((LAND LOC2) RESIDENTIAL) 

∧ EQUAL ((AREA LOC2) (50 60)) 

∧ EQUAL ((CONTAINS LOC2 LOC1) TRUE) 

It is to be emphasised that the set of functions and arguments with which a spatial object 
may be represented in the system is definable by the user by way of the various editors. 
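
The city example can be illustrated with a minimal Python sketch. It is not the KBGIS-II implementation: the layer contents, the bounding-box reading of CONTAINS, the scaled-down area ranges, and the names `land`, `area`, `contains`, `equal` and `satisfies` are all assumptions made for illustration. A location is a set of pixels, and an object definition is a conjunction of (function, arguments, value) triples.

```python
# Hypothetical 4x4 landuse layer (a PPROP): a commercial 2x2 core inside a residential ring.
LANDUSE = {
    (r, c): ("COMMERCIAL" if 1 <= r <= 2 and 1 <= c <= 2 else "RESIDENTIAL")
    for r in range(4) for c in range(4)
}

def land(loc):
    """PPROP: the landuse shared by every pixel of the location, or None if mixed."""
    values = {LANDUSE[p] for p in loc}
    return values.pop() if len(values) == 1 else None

def area(loc):
    """GPROP: area of the location in pixels."""
    return len(loc)

def bbox(loc):
    rows = [r for r, _ in loc]
    cols = [c for _, c in loc]
    return min(rows), min(cols), max(rows), max(cols)

def contains(outer, inner):
    """RPROP (assumed semantics): bounding box of `outer` encloses that of `inner`."""
    r0, c0, r1, c1 = bbox(outer)
    s0, t0, s1, t1 = bbox(inner)
    return r0 <= s0 and c0 <= t0 and s1 <= r1 and t1 <= c1

def equal(value, target):
    """EQUAL: exact match, or membership in a (low, high) range."""
    if isinstance(target, tuple) and len(target) == 2:
        low, high = target
        return value is not None and low <= value <= high
    return value == target

# Conjunction of predicates; area ranges are scaled down to fit the toy layer.
CITY = [
    (land, ("LOC1",), "COMMERCIAL"),
    (area, ("LOC1",), (3, 5)),
    (land, ("LOC2",), "RESIDENTIAL"),
    (area, ("LOC2",), (10, 14)),
    (contains, ("LOC2", "LOC1"), True),
]

def satisfies(definition, binding):
    """True if every predicate holds under the variable-to-location binding."""
    return all(equal(fn(*(binding[name] for name in args)), target)
               for fn, args, target in definition)

core = frozenset(p for p, v in LANDUSE.items() if v == "COMMERCIAL")
ring = frozenset(p for p, v in LANDUSE.items() if v == "RESIDENTIAL")
print(satisfies(CITY, {"LOC1": core, "LOC2": ring}))
print(satisfies(CITY, {"LOC1": ring, "LOC2": core}))
```

Keeping the definition as plain data mirrors the point made above that the set of functions is user-definable: adding a property only requires registering another function.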

3.2. The Spatial Object Hierarchy 

We now define a special GPROP called "TYPE" that allows us to define high-level spatial 
objects that are themselves defined in terms of the basic P-, G- and R-PROPs. Hence we may 
partially order spatial objects, and so impose a hierarchical structure on them. In its simplest 
application, TYPE ascribes a name to a spatial object that is defined as a conjunction of 
PPROPS, GPROPS and RPROPS with specified values. An example of a high-level spatial 
object is: 

((TYPE X) GEOL-OBJ1) 

<=> 

((LAND X1) FOREST) 

∧ ((AREA X1) LARGE) 

∧ ((SHAPE X1) CIRCULAR) 

∧ ((GEOL X2) 4) 

∧ ((ELEV X2) (50 100)) 

∧ ((AREA X2) MEDIUM) 

∧ ((DISTANCE X1 X2) (60 100)) 

∧ ((DIRECTION X1 X2) NORTH) 

(the predicate EQUAL is implicit, but omitted in this statement). 

The above definition states that any set of locations X1 and X2 satisfying the unary and binary 
constraints specified on the right hand side constitutes a location X of the high-level object named 
GEOL-OBJ1. The relationship between the location X and the locations X1 and X2 may be 
chosen in some appropriate manner. For example, X may be the convex hull of X1 and X2, the 
union of X1 and X2, or the centroid of X1 and X2. The unary constraints on the location X1 are 
specified by the two GPROP functions Area and Shape, and on the location X2 by the GPROP 
function Area. Binary constraints on the locations X1 and X2 are specified by the two RPROP functions 
Distance and Direction. 

In general, high-level spatial objects may be defined in terms of other high-level objects 
using the TYPE property, in conjunction with other PPROPs, GPROPs and RPROPs. 

The use of the TYPE property in assigning a name to a high-level spatial object accom- 
plishes two objectives: 

a) It provides a convenient shorthand notation by means of which objects may be defined in 
terms of previously defined objects. Given, for example, that two objects named LAND-1 and 
LAND-2 have been defined, it is then possible to specify a new high-level spatial object 
LAND-3 as follows: 

((TYPE X) LAND-3) 

∧ ((TYPE X1) LAND-1) 

∧ ((TYPE X2) LAND-2) 

∧ ((DISTANCE X1 X2) (20 30)) 


b) The TYPE property allows us to store newly found locations for high-level objects in a 
database indexed by object name and location. The indexing by location is achieved with a 
discrimination net, with each high-level object having its own discrimination net. These 
data structures are described below. 

Any high-level spatial object may thus be seen to form the root of a tree, the complete 
expansion of which yields leaves which are PPROPs, GPROPs and RPROPs. On this basis, we 
may then assign each high-level spatial object some measure of its complexity that takes into 
account the height of the tree that links it to the leaves, the number of component objects at 
each level in the tree, and the complexity of the spatial relations (RPROP predicates) at each 
level. 

For the purposes of describing the spatial object search process (see below), it proves con- 
venient to distinguish between high level spatial objects and primitive spatial objects. Any object 
that has been defined in the Spatial Object Database and hence has a name which is the value of 
the TYPE property, will be referred to as a high level spatial object. The term primitive spatial 
object will be used to refer to any connected set of pixels represented by some conjunction of 
PPROPS. It is easy to see that any high-level object may be ultimately defined in terms of primi- 
tive spatial objects, and an appropriate set of RPROPS and GPROPS. 
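
The tree view of a high-level object can be sketched as follows. This is an illustrative assumption rather than the system's data structure: the dictionary `DEFINITIONS`, the contents of LAND-1 and LAND-2, and the functions `height` and `leaves` are hypothetical. Expanding every TYPE predicate recursively yields the PPROP, GPROP and RPROP leaves, and the height of the tree is one possible ingredient of a complexity measure.

```python
# Hypothetical definitions: object name -> list of (property, arguments, value).
# The contents of LAND-1 and LAND-2 are invented for illustration.
DEFINITIONS = {
    "LAND-1": [("LAND", ("X1",), "FOREST"), ("AREA", ("X1",), "LARGE")],
    "LAND-2": [("GEOL", ("X1",), 4)],
    "LAND-3": [
        ("TYPE", ("X1",), "LAND-1"),
        ("TYPE", ("X2",), "LAND-2"),
        ("DISTANCE", ("X1", "X2"), (20, 30)),
    ],
}

def height(name):
    """Height of the tree rooted at a high-level object; property leaves have height 0."""
    child_heights = [
        height(value) if prop == "TYPE" else 0
        for prop, _args, value in DEFINITIONS[name]
    ]
    return 1 + max(child_heights)

def leaves(name):
    """Fully expand TYPE predicates and list the (property, value) leaves in order."""
    found = []
    for prop, _args, value in DEFINITIONS[name]:
        if prop == "TYPE":
            found.extend(leaves(value))
        else:
            found.append((prop, value))
    return found

print(height("LAND-3"))
print(leaves("LAND-3"))
```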

4. THE USER INTERFACE 

The User Interface allows the user to select from among the four main functions of the sys- 
tem (querying, editing, learning and tracing), and to supply the appropriate inputs and outputs. 
At present most user inputs into the system are by way of a keyboard, while outputs from querying 
the system are displayed on a graphics device. 


4.1. Querying 


In query mode the user may select one of the two fundamental queries, (1) or (2). Queries of 
both types may be entered either interactively or from a file. 

4.2. Editing 

In edit mode, the user may modify either the SOL and associated procedures, using the 
Function Editor, or the system’s knowledge base, using the Object Editor. 

4.3. Learning 

In learn mode, the user may cause the system to learn a definition of a new spatial object 
from given examples. Either the system searches for and generates these examples, or the user 
provides the examples. 

5. THE KNOWLEDGE AND DATA BASES 
5.1. The Spatial Object Knowledge Base 

The Spatial Object Knowledge Base stores both the definitions of, and useful information 
about, all objects known to the system. This knowledge base is implemented in terms of a slot 
and filler data structure (Nilsson [16]) and a discrimination net data structure (Charniak et al. [4]). 
Information concerning object definitions, search heuristics, object classification, and 
object complexity, as well as low level search procedures that may be directly invoked in searching 
for spatial objects, is stored in the knowledge base. Information concerning known locations 
of spatial objects that have been previously found is stored in the discrimination net database. 

The slot names and information stored in the slot and filler data structures are shown in 
Figure 2. The information in each slot can be augmented, modified or deleted by means of the 
spatial object knowledge base editor. The information stored in this database may also be 
modified in the inductive learning mode of the system. 


The discrimination net database is used to store locations of known examples of spatial 
objects that are generated by the system during the course of answering user queries. Each object 
has its own discrimination net. The keys for the discrimination net are derivable given the name 
of a spatial object and the desired location tree address in the database. 

5.2. The Function Knowledge Base 

The Function Knowledge Base stores information on functions used by the system in searching 
for spatial objects. Information on the functions that evaluate the GPROP and RPROP properties 
of spatial objects is stored in this knowledge base. 

The user has the ability to add, modify and delete information from this database using a 
function editor. Information on the ability to propagate constraints, the computational complexity, 
subroutine names, symmetry, range and learning-related information is stored for each 
GPROP and RPROP function. The slot names and information stored in the knowledge base are 
shown in Figure 3. The system utilizes this information to control search for spatial objects and to 
generate information in learn mode. 

5.3. The Location Tree Data Base 

The Location Tree Data Base stores information on the spatial distribution of both region 
based PPROPS and linear features existing within the area covered by the database. 

5.3.1. Region Data 

The raw input for the region based PPROPS from which the location tree data base is built 
consists of a raster image for each layer such as landuse, geology or elevation. The conceptual 
data model utilized for data storage is the quadtree structure. This data structure is based on a 
recursive partitioning of space into four quadrants, and has been discussed extensively in the 
literature (see, for example, Samet [19, 20, 21, 22], Hunter and Steiglitz [10] and Tanimoto and 
Pavlidis [24]). The location tree database extends the quadtree concept, allowing for the encoding 
of multiple layers of thematic information, with more than one class of information on each layer 
being stored at an internal node of the location tree. As discussed in the section on the spatial 
object language, the PPROPS represent primitive pixel properties such as landuse, geology and 
elevation. There is a layer in the location tree corresponding to each such PPROP in the database. 
Each node in the location tree is structured as a three dimensional frame. One slot is allocated 
for each PPROP in the database. Each layer (slot) in turn is a frame which contains the 
following slots: 

a) The VALUE slot stores the data values that occur in the area represented by the node. Each 
PPROP is quantized to have a maximum of fifty discrete values. At each intermediate node 
in the tree, a list of values occurring below the node (together with the areal extent of each 
value) is stored. The data values are not averaged before storing, as in the construction of 
the pyramid data structure described in Tanimoto [24]. The availability of the areal extent 
of each data value allows the dynamic computation of the color of a node. Thus a node 
may be classified as black, white or grey with respect to a particular data value depending 
on a variable percentage threshold. 

b) The DISTRIBUTION slot stores information on the areal extent of each data value in the 
area represented by the node. The DISTRIBUTION slot may be used to store more than 
one statistic for describing various aspects of the distribution of data values. The information 
stored in the DISTRIBUTION slot is used to compute node color based on flexible criteria as 
described above. 

During search to satisfy a query, each node visited by the search process is tagged using a search- 
tag. Allocation of space for these search-tag fields is a dynamic process and occurs during search. 
A unique search-tag field is used for each primitive object (connected region) that is part of a 
query. The information stored in the search-tag field is valid only during the dynamic extent of a 
query and may be removed and the space deallocated on completion of the search. 
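
The dynamic computation of node color from stored areal extents can be sketched as follows. This is a simplified illustration under stated assumptions, not the Location Tree implementation: it handles a single layer, builds a complete tree down to individual pixels, and the names `build`, `color` and `GRID` are hypothetical. Each node keeps a count of pixels per data value, and the same node can be black, grey or white for a given value depending on the threshold chosen at query time.

```python
from collections import Counter

def build(grid, r0, c0, size):
    """Recursively build a node covering the size x size block at (r0, c0)."""
    if size == 1:
        return {"areas": Counter({grid[r0][c0]: 1}), "children": []}
    half = size // 2
    # Children in the order NW, NE, SW, SE.
    children = [build(grid, r0 + dr, c0 + dc, half)
                for dr in (0, half) for dc in (0, half)]
    areas = Counter()
    for child in children:
        areas.update(child["areas"])  # areal extents are summed, not averaged
    return {"areas": areas, "children": children}

def color(node, value, threshold=1.0):
    """Classify a node for `value` using a variable fractional threshold."""
    share = node["areas"][value] / sum(node["areas"].values())
    if share >= threshold:
        return "black"
    if share <= 1.0 - threshold:
        return "white"
    return "grey"

# Hypothetical landuse layer: A = agriculture, F = forest.
GRID = [
    ["A", "A", "F", "F"],
    ["A", "A", "F", "F"],
    ["A", "F", "F", "F"],
    ["A", "A", "F", "F"],
]
root = build(GRID, 0, 0, 4)
south_west = root["children"][2]
print(color(root, "F"))                # mixed block
print(color(south_west, "A"))          # strict threshold
print(color(south_west, "A", 0.75))    # relaxed threshold
```

With the strict default threshold the south-west quadrant is grey for agriculture (three of its four pixels), while a 75 percent threshold classifies it as black, which is the flexibility the stored areal extents are meant to provide.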
5.3.2. Linear Data 

The raw input for the linear based data consists of binary raster images of each linear 
feature such as roads and streamlines. This data is converted to vector form through edge follow- 
ing procedures and the resulting vector representations are stored in spatially indexed form, as 
properties of the nodes in the higher levels of the Location Tree Data Base. Each vector represen- 
tation of a linear feature consists of a series of straight line segments. These segments are stored 
in an array and cursors uniquely identify each breakpoint between segments. It is these cursors 
that are stored in the nodes of the Location Tree Data Base, permitting efficient retrieval of 
the subset of streams or other linear features within any specified block of the database. 

6. EDITORS 

KBGIS-II provides two editors, the Function Editor and the Spatial Object Knowledge 
Base Editor. The Function Editor permits the user to modify the function knowledge base and 
the Spatial Object Knowledge Base Editor (the Object Editor) permits the user to update the 
spatial object knowledge base. These editors are menu driven, and the user may alter the 
knowledge bases by selecting any of the following modes. 

a) In ADD mode, the Object Editor may create a new spatial object. It queries the user for the 
FeatureType of the new object. An object definition package is then invoked and the user is 
guided through the construction of the object’s DefinedBy slot in terms of the SOL. Besides 
the definition, the editor also asks for other information such as class, heuristics, and linear 
and areal dimensions. In this mode, the Function Editor adds a new GPROP or RPROP. 
The user specifies the name of the file that contains the definition of the function. If the new 
function propagates constraints, the file should contain a function which can return a 
new search window. Besides the function definition, the editor also asks for associated 
parameters such as complexity, symmetry and domain. 

b) In DELETE mode the Object Editor deletes spatial objects from the knowledge base. The 
deletion of an object is allowed by the system only if it is not currently used as a component 
in the definition of any other spatial object in the knowledge base. The Function Editor 
deletes GPROPS and RPROPS from the function knowledge base. 

c) In MODIFY mode the user may modify the contents of any slot of either a selected spatial 
object or a function. The system ensures that logical consistency is maintained before 
allowing modifications to be made. 

d) In DISPLAY mode the user is allowed to browse through the knowledge base, examining 
selected components of selected objects or functions. 

e) In HELP mode the user is provided with aid in using the editors. 

f) In END mode, the user may save the changes made in the current session. 

7. SPATIAL SEARCH 

It is clear from the preceding discussion that procedures that search for spatial objects lie at 
the core of GIS in general and of KBGIS-II in particular. In this section of the essay, we briefly 
outline the major principles and procedures that underlie the search for spatial objects in KBGIS-II. 
It should be recalled that search efficiency is a major requirement in most GIS. 

7.1. Principles of Search 

Smith and Peuquet [23] outlined five principles that underlie the search procedures in 
KBGIS-II. We repeat those principles here with one further addition: 

a) The use of hierarchical decompositions in both data structures and in the search procedures 
applied to the data structures. 

b) The availability of different search strategies that may be chosen as the most efficient in a 
given search context. 

c) The application of best first search procedures in which domain-specific knowledge is used to 
reduce the sets of locations that need to be searched in answering queries. 


d) The use of a constraint-satisfaction approach. 

e) The use of recursion. 

f) The use of dynamic updating of the system’s knowledge base in response to query satisfac- 
tion. 

The application of these principles is implicitly described in the detailed descriptions of the search 
procedures that are provided in the following sections. 

7.2. Search Procedures 

For convenience, we now provide a brief overview of the search procedures, based on the six 
principles enunciated above, that are employed in KBGIS-II when satisfying queries of type (1). 
When a query is entered by a user, it is parsed and checked for syntactic correctness and the user 
is prompted for any modifications. The (high-level) object of the query is then transformed into a 
semantic network representation, in which links represent RPROP relations (or constraints) 
between the subobjects of the query that must be satisfied. The network is then augmented with 
heuristic knowledge and the subobjects at the nodes are ordered. A constraint satisfaction pro- 
cedure is then applied to the nodes in the designated order. Search first occurs in the system 
knowledge base for specific subobjects that are known to satisfy the relational and spatial con- 
straints. If the satisfaction of the query cannot be accomplished by this lookup procedure, the 
search procedure is recursively called on the subobjects of the node. The recursion terminates in 
procedures that search the location tree database of the system. When a query is ultimately 
satisfied by a search of this database and when the search is considered computationally expen- 
sive, the result is stored in the system’s knowledge base for use in future search. 

In the above search process constraint satisfaction procedures are used to satisfy all unary 
(GPROP) and binary (RPROP) constraints used in the definition of an object, and as such pro- 
vide the core of our approach to spatial search. The general constraint satisfaction problem 
(CSP) has been studied by many researchers, including Mackworth [13, 15] and Haralick et al. [6, 7]. 
The problem may be stated as follows [15]: 


Given a set of m variables each with an associated domain and a set of constraining 
relations each involving a subset of the variables, find all possible m-tuples such that 
each m-tuple is an instantiation of the m variables satisfying the relations. 

Mackworth considers only CSPs that are discrete, finite and for which the relations are unary 
and binary. 

The classical approach to the CSP entails the use of backtracking. The variables are instan- 
tiated in sequential order using labels selected from an ordered representation of the domain. 
Backtracking therefore corresponds to a depth first search of the combinatorial search space, with 
the truth values of intermediate predicates being tested in order to terminate unsuccessful 
searches as early as possible. As soon as the variables of any predicate are instantiated, the truth 
value of the predicate is tested. If true, then the process of testing and instantiation continues, 
but if false the process falls back to the variable last instantiated that has untried values in its 
domain and reinstantiates it to its next value. 
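
The backtracking scheme described above can be sketched in a few lines of Python. This is a generic illustration, not the KBGIS-II procedure: locations are reduced to hypothetical one-dimensional coordinates, and the names `backtrack` and `dist_between` are assumptions. Variables are instantiated in a fixed order, and each predicate is tested as soon as all of its variables are bound, so that a failed test abandons the whole subtree below it.

```python
def backtrack(variables, domains, constraints, assignment=None):
    """Yield every complete assignment satisfying all constraints (depth-first)."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        yield dict(assignment)
        return
    var = variables[len(assignment)]          # next variable in sequential order
    for value in domains[var]:                # labels tried in domain order
        assignment[var] = value
        # Test every predicate that has just become fully instantiated.
        if all(pred(*(assignment[v] for v in scope))
               for scope, pred in constraints
               if var in scope and all(v in assignment for v in scope)):
            yield from backtrack(variables, domains, constraints, assignment)
        del assignment[var]                   # fall back and try the next value

def dist_between(low, high):
    """Binary constraint on toy 1-D locations: separation within [low, high]."""
    return lambda a, b: low <= abs(a - b) <= high

variables = ["LOC1", "LOC2", "LOC3"]
domains = {"LOC1": [0, 5], "LOC2": [12, 40], "LOC3": [20, 30]}
constraints = [
    (("LOC1", "LOC2"), dist_between(10, 20)),
    (("LOC2", "LOC3"), dist_between(5, 10)),
]
solutions = list(backtrack(variables, domains, constraints))
print(solutions)
```

Of the eight possible triples, only one survives; the choice LOC1 = 5 is rejected after testing two values of LOC2, without ever instantiating LOC3.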

Although the intrinsic merit of backtracking is that substantial portions of the generate and 
test search space (the cartesian product of all the variable domains) are eliminated by a single 
failure, it may still be very inefficient. Various improvements to the procedure have been suggested, 
such as preprocessing the network for node, arc and path consistency (see Mackworth [13, 15]) 
and forward looking tree search which prunes the search space through the use of a 
look ahead procedure (see Haralick [8]). 
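
As an illustration of consistency preprocessing, the sketch below enforces arc consistency on binary constraints before any backtracking takes place. It follows the general idea of repeatedly revising domains until no value lacks support; it is not claimed to be the exact procedure of the cited papers or of KBGIS-II, and the names `revise`, `arc_consistency` and `dist_between`, as well as the toy one-dimensional locations, are assumptions.

```python
from collections import deque

def revise(domains, x, y, pred):
    """Drop values of x with no supporting value in y; report whether x changed."""
    kept = [a for a in domains[x] if any(pred(a, b) for b in domains[y])]
    changed = len(kept) != len(domains[x])
    domains[x] = kept
    return changed

def arc_consistency(domains, constraints):
    """Prune `domains` in place. constraints: {(x, y): pred(x_value, y_value)}."""
    arcs = {}
    for (x, y), pred in constraints.items():
        arcs[(x, y)] = pred
        # Reverse arc: swap the argument order so pred still sees (x_value, y_value).
        arcs[(y, x)] = lambda b, a, pred=pred: pred(a, b)
    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y, arcs[(x, y)]):
            # x lost values, so every other arc pointing at x must be rechecked.
            queue.extend((z, w) for (z, w) in arcs if w == x and z != y)
    return domains

def dist_between(low, high):
    return lambda a, b: low <= abs(a - b) <= high

domains = {"LOC1": [0, 5], "LOC2": [12, 40], "LOC3": [20, 30]}
constraints = {
    ("LOC1", "LOC2"): dist_between(10, 20),
    ("LOC2", "LOC3"): dist_between(5, 10),
}
print(arc_consistency(domains, constraints))
```

In this small case the preprocessing alone reduces every domain to a single value, so the subsequent backtracking search has nothing left to explore.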

We discuss our approach to spatial search in more detail in the following sections, first in 
terms of the high level search procedures that control search, then in terms of the constraint satisfaction 
procedures and finally in terms of the low-level procedures that search the location tree 
database. 


7.3. The SOL and Search Procedures 


The structure of the SOL may now be viewed in terms of its relation to the search pro- 
cedures. First, the use of a language that involves only unary (PPROP, GPROP) and binary 
(RPROP) relations allows the immediate construction of semantic network representations of the 
spatial objects. These representations have a natural spatial interpretation and provide a data 
structure upon which constraint satisfaction techniques may be naturally applied. Second, the use 
of the TYPE predicate in the SOL permits the natural use of recursive calls during the process of 
query satisfaction. 

7.4. The Complexity of Spatial Object Search 

As noted above, the SOL is of value in indicating the computational complexity of the 
search for spatial objects. By the complexity of search for a given object, we shall mean a meas- 
ure of the computational time that is required to find such an object, stated as a function of some 
measure of the object’s size. We now provide a simple and heuristic argument indicating that the 
search for spatial objects is in general a very difficult computational problem. We show by way 
of an example that it is easy to construct spatial objects that have a very simple representation in 
terms of the SOL defined above, and a very high order of search complexity. 

We may conceive of a spatial object that is comprised of n subobjects, which are linked in 
such a manner as to give rise to a connected graph. We shall use the number of subobjects (n) as 
the measure of the size of the spatial object. The links between subobjects may be represented in 
terms of some RPROP. We may further assume that each of the subobjects is characterized by 
some GPROP that can take on two values with equal probability. If we assume that the subob- 
jects are distributed at random in our spatial database, then the probability that any given loca- 
tion satisfies a GPROP constraint, (and hence constitutes an example of the corresponding subob- 
ject) is 1/2. The probability that n locations, in the configuration specified by the RPROPS, 
satisfy the GPROP constraints is hence (1/2)^n. In the absence of preprocessing, and assuming 
that subobjects are located at random in our spatial database, it is necessary to examine each 
n-tuple of locations (that lie in the configuration specified by the RPROPS) to check whether the 
GPROP constraints are satisfied. It follows that we will have to search O(2^n) times on average 
before finding an object with a set of nodes having the prescribed GPROP values. Furthermore, 
search could take significantly longer in some cases, and it is easy to express much more complicated 
objects in the SOL. 
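
The step from the success probability to the average number of trials can be made explicit. Assuming, as the argument does, that successive n-tuples are independent trials with the same success probability p, the number of tuples T examined up to and including the first success is geometrically distributed:

```latex
\[
p = \left(\tfrac{1}{2}\right)^{n},
\qquad
E[T] \;=\; \sum_{k=1}^{\infty} k\,p\,(1-p)^{k-1} \;=\; \frac{1}{p} \;=\; 2^{n}.
\]
```

For example, under these assumptions an object with n = 10 subobjects requires on the order of a thousand candidate tuples on average, and one with n = 20 on the order of a million.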

Reduction in this time complexity is possible if additional information is available to the 
search process. If subobjects are not distributed at random in the database then such information 
may be created by preprocessing and/or by making heuristic knowledge on the distribution patterns 
of objects available to the search process. Heuristic knowledge, in the above example, may 
consist of stored windows for each subobject within which the probability of a location satisfying the 
GPROP (PPROP) constraints necessary to make it an example of the subobject is higher than 
for the rest of the database. Similarly, a stored window for the parent spatial object will indicate a 
higher probability, within the window, that n-tuples of locations that lie in the spatial 
configurations specified by the RPROPS satisfy the GPROP constraints. Within this stored 
window there is thus exploitable correlation in the locations of subobjects. 

Despite the possible speedup in search made possible by such preprocessing, the inevitable 
conclusion of the preceding remarks is that the search for arbitrary spatial objects describable in 
terms of our SOL is a problem with a high order of computational complexity. 

8. HIGH LEVEL OBJECT SEARCH 

High Level Object Search is the procedure used to search for locations of any high level 
object and is used to satisfy a query of type (1). It is first called upon to find examples of the 
’Query’ object. It may be called recursively if the ’Query’ object has other high-level objects as its 
descendants. The level of recursion permitted in the search process is unlimited. 

The first step in spatial search is to reduce the size of the search window using available 
knowledge concerning the locations of objects. This is accomplished by accessing the object 
knowledge base to find other high-level objects that are contextually related to the object sought. 


The system then determines if any known examples of such ancillary objects exist within the 
search window. If so, sub-windows are constructed around each of the ancillary locations, and 
are employed as likely areas for search. Hence a queue of windows is constructed, and the system 
searches sequentially for the object in each of these windows until the required number of 
examples of the object are found. For any one window this task may be accomplished by the 
high-level object search procedure in two ways: 

a) Known locations of the object in the specified window of the spatial database may be 
retrieved from the spatially indexed knowledge base of known examples described above. 
The set of known locations stored in this knowledge base is not complete, and depends on 
the history of previous searches. At any time, this set generally contains only a fraction of 
the examples of the objects that exist implicitly in the spatial database. 

b) New locations of the object may be discovered through the process of search in the window. 
The process of searching for a new location of a high-level object with m sub-objects entails 
discovering m locations, one for each of its sub-objects, such that this set of locations satisfies 
all unary and binary constraints that define the parent object. If the query requires search- 
ing for n examples of the parent object, then n such sets of m locations each must be 
found. Searching for a new location of the parent object given a set of candidate locations 
for each of the m sub-objects is a constraint satisfaction problem. This problem consists of 
an allocation of locations to sub-objects from their candidate sets such that when all m 
assignments have been made, all constraints on and between sub-objects are satisfied. The 
next section will provide details on the design of the constraint satisfaction procedure implemented 
in KBGIS-II. 

The task of determining which candidate locations for any one of the m sub-objects to 
employ in the constraint satisfaction procedure is a recursive specification of the task of determining 
locations of a high-level spatial object. The recursion terminates in the task of determining 
the locations of a primitive spatial object. Known examples of such objects are not stored and 
their locations are always determined through a search of the spatial database. This search 
involves the determination of a connected set of pixels satisfying a conjunction of disjunctions of 
PPROP predicates and is achieved through an appropriate region growing process. Details of this 
primitive object search procedure are presented in a later section. 
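
A region growing search for a primitive object can be sketched as below. This is only an illustration under stated assumptions: it works on plain raster layers rather than on the location tree, uses 4-connectivity, and the function name `grow_region` and the layer contents are hypothetical. The query is a conjunction of disjunctions of PPROP predicates, written as a list of clauses, each clause being a list of (property, value) alternatives.

```python
from collections import deque

def grow_region(layers, seed, clauses):
    """Return the 4-connected set of pixels around `seed` that satisfy `clauses`.

    layers:  {pprop_name: 2-D list of values}, all of the same shape
    clauses: conjunction of disjunctions of (pprop_name, value) pairs
    """
    def ok(r, c):
        return all(any(layers[name][r][c] == value for name, value in clause)
                   for clause in clauses)

    first = next(iter(layers.values()))
    rows, cols = len(first), len(first[0])
    if not ok(*seed):
        return set()
    region, frontier = {seed}, deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region and ok(nr, nc)):
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region

# Hypothetical layers: F = forest, S = scrub, U = urban; geology class 4 everywhere.
layers = {
    "LAND": [["F", "F", "U"], ["S", "U", "U"], ["U", "U", "F"]],
    "GEOL": [[4, 4, 4], [4, 4, 4], [4, 4, 4]],
}
query = [[("LAND", "F"), ("LAND", "S")], [("GEOL", 4)]]   # (forest OR scrub) AND geology 4
print(sorted(grow_region(layers, (0, 0), query)))
```

The forest pixel in the bottom-right corner satisfies the predicates but is not connected to the seed, so it is excluded from the region grown from the top-left corner.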

High level object search may be represented in terms of the tree shown in Figure 4. We 
consider the task of finding new locations of the root high-level object O, shown in Figure 4, in a 
window. It is assumed that heuristic knowledge has already been applied to constrain the size of 
the window as described above. The number of sub-objects of a parent object is not bounded, 
and varies with the TYPE of the object, but has been taken as three in this example. This task 
may be addressed by using a constraint satisfaction procedure taking as input the locations of its 
sub-objects o1, o2 and o3, and as constraints the binary spatial relations that link o1, o2 and o3. 
The locations of the sub-objects o1, o2 and o3 that serve as input to the constraint satisfaction 
procedure may be known examples from the spatially indexed database of known examples or new 
locations discovered by search. 

Searching for new locations of o1, for example, is a recursive application of this task with o1 
as the parent object and o11, o12 and o13 as the sub-objects. The recursion terminates, for example, 
at o11, which is a primitive object and is searched for directly in the spatial database. 
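
The recursive control structure can be summarized in a short sketch. It simplifies the procedure described here in several labeled ways: stored examples are treated as complete, every result is stored regardless of cost, constraint satisfaction is replaced by exhaustive enumeration, locations are toy sets of one-dimensional cells, and each parent has two sub-objects rather than three. The names `find`, `gap`, `OBJECTS` and `PRIMITIVES` are hypothetical.

```python
from itertools import product

def find(name, objects, primitives, known):
    """Return candidate locations for `name`, recursing into sub-objects as needed."""
    if name in primitives:
        return primitives[name]()       # primitive objects are always searched directly
    if name in known:
        return known[name]              # reuse previously found examples
    parts, constraint, combine = objects[name]
    candidates = [find(part, objects, primitives, known) for part in parts]
    # Exhaustive enumeration stands in for the constraint satisfaction procedure.
    found = [combine(*locs) for locs in product(*candidates) if constraint(*locs)]
    known[name] = found                 # store the result for use in future searches
    return found

def gap(a, b):
    """Smallest separation between two toy 1-D locations."""
    return min(abs(x - y) for x in a for y in b)

def union(*locs):
    return frozenset().union(*locs)

PRIMITIVES = {
    "o11": lambda: [frozenset({0, 1}), frozenset({10})],
    "o12": lambda: [frozenset({3})],
    "o2": lambda: [frozenset({8}), frozenset({30})],
}
OBJECTS = {
    "o1": (("o11", "o12"), lambda a, b: gap(a, b) <= 3, union),
    "O": (("o1", "o2"), lambda a, b: 3 <= gap(a, b) <= 6, union),
}
known = {}
print(find("O", OBJECTS, PRIMITIVES, known))
print(sorted(known))
```

After the first query both O and o1 have entries in `known`, so a later query involving o1 would be answered by lookup rather than by a new search of the database.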

The above procedure is followed in the search for new examples of all defined high level 
objects except in those cases where special purpose search procedures exist. Information on these 
procedures is stored in the Spatial Object Knowledge Base and is available to the control process. 
In these cases the special search function is directly called. The examples returned are absorbed 
into the constraint satisfaction process if the object in question was a subobject of some parent 
object being searched. This is the way in which defined objects that are linear features are 
searched for. This ability to interface to external search routines allows the system to utilize 
efficient special purpose algorithms that may be applicable in the search for a user defined object. 
In these cases the user may provide the system with necessary knowledge concerning the special 
purpose function through the function editor. 


9. CONSTRAINT SATISFACTION 


We now consider the task of constraint satisfaction at any intermediate level in the hierar- 
chy shown in Figure 4. For concreteness, we consider the procedure operating on the sub-objects 
o1, o2 and o3. These sub-objects are subject to both unary (GPROP) and binary (RPROP) constraints. 
The high-level object search on the parent object O converts its definition into a semantic 
network, as shown in Figure 4, and this network, with o1, o2 and o3 as nodes, is passed to the 
constraint satisfaction procedure. Each node is linked by spatial relations (constraining arcs) to 
its siblings, and by parent and child links to the nodes immediately above and below it in the 
hierarchy. The child nodes are created only if the search procedure is recursively called on any of 
o1, o2 or o3. The constraint satisfaction procedure is concerned only with the spatial relations 
and operates on the set of nodes that are siblings (i.e. o1, o2 and o3). 

The above constraint satisfaction problem for spatial objects may be mapped onto the gen- 
eral constraint satisfaction problem described in a previous section of the paper. The variables 
represent the locations of the m sub-objects of a parent object while the domain of each variable 
is the set of candidate locations for the sub-object. A feature of the spatial search problem is that 
the knowledge possessed by the constraint satisfaction procedure concerning the variable domains 
(the set of candidate locations of each sub-object) may be partial. The spatial constraint satisfac- 
tion procedure may not generally assume that it is working with all the possible values of each of 
the m variables. Through exhaustive search, it is possible to determine all locations of each sub- 
object in the window, before beginning the backtracking search for m tuples of locations that 
satisfy the constraints necessary to form an example of the sought for parent object. This may be 
appropriate if one is searching for all examples of the parent object in the window, but is inap- 
propriate if one is searching for a small number of instances of the parent object. In the latter 
case the cost of exhaustive search for all examples of sub-objects in a large database before
beginning the constraint satisfaction task for the parent objects may be computationally expensive and
unwarranted. The spatial search procedure in KBGIS II therefore dynamically selects a constraint 
satisfaction strategy based on the nature of the spatial search to be performed. 


If a large number of examples of the parent object are to be found then all locations of sub- 
objects in the window are first determined before searching for consistent m-tuples of locations. 
Backtracking is used to discover the set of consistent m-tuples and this search may be speeded up 
using consistency and forward looking criteria as discussed above. 
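
The exhaustive strategy can be illustrated with a minimal backtracking sketch. It assumes that all candidate locations of each sub-object have already been found, and it checks each new assignment against the earlier ones as a simple consistency test. The names `consistent_tuples` and `binary_ok` are hypothetical and are not taken from KBGIS-II.

```python
def consistent_tuples(domains, binary_ok, limit=None):
    """Enumerate m-tuples (one location per sub-object) that satisfy every
    pairwise constraint.

    domains   -- list of m lists of candidate locations (assumed complete)
    binary_ok -- hypothetical predicate binary_ok(i, a, j, b) -> bool that tests
                 the spatial relation between location a of sub-object i and
                 location b of sub-object j
    limit     -- optional cap on the number of tuples returned
    """
    results = []
    partial = []

    def extend(i):
        if limit is not None and len(results) >= limit:
            return
        if i == len(domains):
            results.append(tuple(partial))
            return
        for loc in domains[i]:
            # Consistency check against all previously assigned sub-objects.
            if all(binary_ok(j, partial[j], i, loc) for j in range(i)):
                partial.append(loc)
                extend(i + 1)
                partial.pop()

    extend(0)
    return results
```

For instance, with one-dimensional locations and a "within 2 units" relation between every pair, the domains `[[0, 10], [1, 11], [2, 30]]` admit only the allocation `(0, 1, 2)`.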

If the number of examples of the parent object sought for in the window is small (in relation 
to the anticipated existing number) then we adopt a different strategy in which we alternate 
between recursive search for new locations of sub-objects and backtracking search for a consistent 
allocation of found locations to sub-objects. At any instant, the constraint satisfaction process 
operates on a subset of found locations of each sub-object within the window. The procedure 
explores this space in an attempt to find a consistent allocation. If it fails, the next task is to 
search for more labels that may be assigned to the sub-objects. The selection of which sub-objects 
to search for, and the selection of sub-windows of the original window in which to search is done 
so as to maximize the probability of finding consistent allocations corresponding to locations of 
the parent object. Once new locations for some of the sub-objects have been found the constraint 
satisfaction procedure resumes on the augmented variable domains. The process oscillates between 
constraint satisfaction and the search for new sub-object locations till the desired number of con- 
sistent allocations corresponding to examples of the parent object are found, or the procedure 
announces failure. 
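
The alternating strategy can be sketched as a loop that interleaves backtracking over the locations found so far with requests for more locations. This is a simplified illustration under stated assumptions: `find_more` is a hypothetical callback standing in for the recursive sub-object search, and the sketch asks every sub-object for more locations in each round rather than choosing sub-objects and sub-windows by probability as the text describes.

```python
def find_parent_examples(find_more, binary_ok, m, wanted):
    """Alternate between constraint satisfaction and sub-object search.

    find_more(i) -- hypothetical callback returning a list of newly found
                    locations for sub-object i (an empty list when exhausted)
    binary_ok    -- hypothetical predicate binary_ok(i, a, j, b) -> bool
    m            -- number of sub-objects of the parent object
    wanted       -- desired number of consistent allocations
    """
    domains = [[] for _ in range(m)]  # partial knowledge of each variable domain

    def backtrack(i, partial, out):
        if len(out) >= wanted:
            return
        if i == m:
            out.append(tuple(partial))
            return
        for loc in domains[i]:
            if all(binary_ok(j, partial[j], i, loc) for j in range(i)):
                backtrack(i + 1, partial + [loc], out)

    while True:
        found = []
        backtrack(0, [], found)
        if len(found) >= wanted:
            return found
        # Too few consistent allocations: augment the domains and try again.
        grew = False
        for i in range(m):
            new_locations = find_more(i)
            if new_locations:
                domains[i].extend(new_locations)
                grew = True
        if not grew:
            return found  # nothing more to find: fewer than `wanted` examples exist
```

The sketch re-runs backtracking from scratch after each augmentation, which keeps it short; a practical implementation would presumably reuse earlier work.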

We believe that the use of these two alternative strategies is an efficient way to accomplish 
spatial search. Studies involving this and other control issues will be presented in a forthcoming 
paper. 

10. PRIMITIVE OBJECT SEARCH 

The task given to the primitive search procedure is the determination of a specified number 
of locations of a primitive object. Each instance of the primitive object corresponds to a con- 
nected region in the search window. The primitive object is represented using a conjunctive nor- 
mal form expression involving only PPROP properties. An example of such an expression is: 


(((LAND X) (10 11)) V ((GEOL X) (1 2))) 

/\ (((ELEV X) (50 90)) V ((ASPECT X) (30 40))) 
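
As an illustration, the expression above can be evaluated for a single pixel as follows. This is a sketch only: it assumes that each pair of numbers is an inclusive range of layer values and that a pixel is available as a mapping from layer name to value; neither detail is confirmed by the text.

```python
# Clauses are ANDed together; the literals inside one clause are ORed.
# Each literal is (layer name, (low, high)), with the range assumed inclusive.
PRIMITIVE_OBJECT = [
    [("LAND", (10, 11)), ("GEOL", (1, 2))],
    [("ELEV", (50, 90)), ("ASPECT", (30, 40))],
]


def satisfies(pixel, cnf):
    """Return True if the pixel (a dict: layer name -> value) satisfies the CNF."""
    return all(
        any(low <= pixel[layer] <= high for layer, (low, high) in clause)
        for clause in cnf
    )
```

A pixel with LAND 10 and ASPECT 35 satisfies both clauses whatever its GEOL and ELEV values are, while a pixel with LAND 12 and GEOL 3 fails the first clause.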

The primitive object search procedure has two alternative strategies available, depending on the 
desired task: 

a) To find a small number of individual instances of a primitive object it uses region growing 
by SEED EXPANSION. 

b) To exhaustively find examples of a primitive object in a window it uses region growing by 
CONNECTED COMPONENT LABELLING. 

For each strategy the primitive object search procedure can also select a cutoff resolution 
level in the location tree database. At this resolution level all nodes are classified as either black 
or white. 

Each node in the location tree database may be classified as WHITE, BLACK or GREY 
with respect to the primitive object, according to the area of the node that satisfies the 
specified PPROP predicates. 

Each node in the location tree has an area depending on its height in the tree. Let N denote 
the number of levels in the location tree. Then a node at level N, referred to as the lowest level, 
has a height of 0, and an area of 1 unit (pixel). A node at height H (level = N - H) has an area of 
(2^H)^2 units. 

The selection of the area that must satisfy the predicates may be made using an absolute 
limit in the following manner. A node with an area of Y pixels may be considered BLACK only if 
it has more than (Y - X) pixels satisfying the predicates; GREY if it has between X and (Y - X) 
pixels satisfying the predicates; and WHITE if it has less than X pixels satisfying the predicates. 
This decision rule ensures that a node corresponding to a level with a node area of X units will be 
classified as only BLACK or WHITE preventing further descent of the tree by the region growing 
algorithm and fixing the resolution at the desired level. Such a rule enables the region growing 
procedure to take full advantage of compaction in the higher levels of the location trees and also 
restricts the resolution to the desired level. Selecting X equal to 0 allows the search to be carried 
out at full resolution. 

If a procedure wishes to view only a single level in the tree as in the case of a raster 
pyramid with no father-son links, then we may employ the following alternative rule. A node at 
some resolution level may be considered BLACK if more than X % of the area of the node 
satisfies the PPROP predicates specified, and WHITE if the area satisfying the predicates is 
between 0 and X %. Such a rule enables each layer to be viewed independently as a raster at the 
desired resolution. 
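
The single-level percentage rule, together with the node area formula, can be written out as a short sketch. The function names are hypothetical, and the count of satisfying pixels is assumed to be available for each node.

```python
def node_area(height):
    """Area in pixels of a location tree node at the given height: (2^H)^2."""
    return (2 ** height) ** 2


def classify_single_level(satisfying_pixels, height, x_percent):
    """Classify a node viewed as one cell of a raster at its own level.

    BLACK if more than x_percent of the node area satisfies the PPROP
    predicates, otherwise WHITE (between 0 and x_percent).
    """
    percent = 100.0 * satisfying_pixels / node_area(height)
    return "BLACK" if percent > x_percent else "WHITE"
```

For a node at height 3 (64 pixels) and X = 50, a node with 40 satisfying pixels (62.5 %) is BLACK, and one with exactly 32 (50 %) is WHITE.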

The first step in the primitive object search procedure is the selection of an appropriate 
region growing strategy and an appropriate resolution level. The selection of strategy and resolu- 
tion level is based on: 

a) The desired number of examples. 

b) The average size of the desired object in relation to the search window. 

If the search strategy selected is SEED-EXPANSION then the constraint satisfaction procedure 
that calls the primitive object search procedure narrows the search window through the 
propagation of binary spatial relations (RPROPS) involving the primitive object and other 
sub-objects of the queried object that have already been searched for. In this way focus of 
attention is achieved in the calls to the primitive object search procedure, using RPROP constraint 
propagation. The first step in SEED-EXPANSION is a systematic search for an initial seed in the 
search window. This search is done using heuristic knowledge based on the size of the object, 
which is an indicator of the depth at which black nodes might be expected to occur. This 
heuristic is used to control the search for the seed, causing it to switch from a depth first search 
of the tree to a breadth first search at the selected depth. Once a seed has been found, it is grown 
using a SEED EXPANSION procedure that finds the complete areal coverage of the region within the 
window. The procedure followed ensures that the maximal block representation of the region 
grown is returned. The procedure is iterated till the desired number of seeds have been grown. 


Each node visited is tagged with a search tag, allowing the above procedure to systematically 
search for and grow seeds till the entire window has been searched or the desired number of 
examples have been found. 
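
The tag-and-grow loop can be illustrated on a plain raster. This sketch is a deliberate simplification: it scans a boolean grid instead of descending a location tree, uses 4-connectivity (an assumption), and returns sets of cells rather than a maximal block representation.

```python
from collections import deque


def grow_seeds(mask, wanted):
    """Return up to `wanted` connected regions of truthy cells in `mask`.

    mask is a list of equal-length rows; each region is a set of (row, col).
    """
    rows, cols = len(mask), len(mask[0])
    tagged = [[False] * cols for _ in range(rows)]  # search tags
    regions = []
    for r in range(rows):
        for c in range(cols):
            if len(regions) == wanted:
                return regions
            if tagged[r][c] or not mask[r][c]:
                continue
            # (r, c) is an untagged seed: expand it to its full region.
            region = set()
            frontier = deque([(r, c)])
            tagged[r][c] = True
            while frontier:
                y, x = frontier.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not tagged[ny][nx]):
                        tagged[ny][nx] = True
                        frontier.append((ny, nx))
            regions.append(region)
    return regions
```

Because the scan stops as soon as the requested number of regions has been grown, a small request touches only part of the window.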

If exhaustive search for the object is to be carried out then the CONNECTED COM- 
PONENT LABELLING algorithm is applied to the search window. This is an application of the 
conventional blob coloring region growing algorithm using the quad tree data structure. The pro- 
cedure is applied top down, marking BLACK nodes and merging connected components. The 
procedure descends to the next resolution level only when a GREY node is found and considers 
only the sons of the GREY node. In this way maximum use is made of the hierarchical tree 
structure of the location tree database. The procedure descends no further than the appropriate 
resolution level where all nodes are classified as either BLACK or WHITE. All connected regions within 
the search window are returned by the procedure. 
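
For comparison, conventional blob coloring on a plain raster can be sketched as a two-pass labelling with a union-find table. This is not the quad tree version described above; it only shows the marking and merging steps, and it assumes 4-connectivity.

```python
def label_components(mask):
    """Two-pass blob coloring; returns a grid of labels (0 = background)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of label i

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # First pass: assign provisional labels and record merges.
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up and left:
                a, b = find(up), find(left)
                parent[max(a, b)] = min(a, b)  # merge the two components
                labels[r][c] = min(a, b)
            elif up or left:
                labels[r][c] = up or left
            else:
                parent.append(len(parent))  # new label is its own root
                labels[r][c] = len(parent) - 1

    # Second pass: replace every provisional label by its representative.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                labels[r][c] = find(labels[r][c])
    return labels
```

The final labels identify components but are not renumbered consecutively.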

11. LEARNING 

The main purpose of implementing learning procedures in KBGIS-II is to reduce query 
search time. It can be accomplished in two ways: either by remembering the results of previous 
search or by learning the definition of an object more precisely so that the search space may be 
pruned rapidly. Hence learning may be classified as either rote learning or inductive learning. 

11.1. Rote Learning 

Rote learning allows the system to memorize the examples of an object for which it has 
already searched, so that when it is asked to search for the same object again, it retrieves the pre- 
vious examples instead of searching again. It stores only predefined high level objects. 

Known examples are stored in a separate database that consists of a discrimination net for 
each defined spatial object. This database constitutes a part of the spatial object knowledge base. 
The discrimination nets used to store examples are basically pointer based quad trees. Each node 
in a discrimination net corresponds to a quad-tree window of the data base. Examples are stored 
at the minimum containing block, i.e. the lowest node which completely contains the example. 
Each object is stored in a different discrimination net. The data base also has one other 
discrimination net called the OBJECT-TREE that is used to store the names of the objects indexed by 
location. If the name of an object X is stored in a node Y, it implies that one or more examples 
of the object X exist in the location tree database within the quad-tree window corresponding to Y. 
This information is useful in answering queries of type (2). 

A query for the locations of an object in a quad-tree window is answered by returning all 
examples stored in the sub-tree under the query node. If the low level search returns a new exam- 
ple of an object, it is added to the proper discrimination net and the OBJECT-TREE is also 
updated. Obviously, all found examples cannot be stored because the space requirement will 
increase monotonically. Hence, after finding a new example the system has to make a decision as 
to whether the example should be stored. The decision taken depends on various factors. If the 
complexity of an object is low, it can be searched for easily and therefore it is not stored in the 
object base. If an object is recursively defined in terms of other high level objects, a decision 
must be made as to whether the subobjects should be stored or the parent object. Again the deci- 
sion taken depends on the cost of reconstructing the object from its subobjects. Besides the com- 
plexity, another criterion for storing the examples is the frequency with which they are sought. 
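
Storage at the minimum containing block and retrieval by quad-tree window can be sketched as follows. The sketch makes several assumptions that are not in the text: examples are represented by inclusive bounding boxes, the root window side is a power of two, and nodes are identified by a path of quadrant numbers held in a dictionary rather than by the pointer based structure used in the system.

```python
def containing_block(bbox, size):
    """Path (tuple of quadrant numbers 0-3) to the smallest block containing bbox.

    bbox = (x0, y0, x1, y1) in inclusive pixel coordinates; size is the side of
    the root window, assumed to be a power of two.
    """
    x0, y0, x1, y1 = bbox
    ox = oy = 0  # origin of the current block
    path = []
    while size > 1:
        half = size // 2
        qx0, qx1 = (x0 - ox) // half, (x1 - ox) // half
        qy0, qy1 = (y0 - oy) // half, (y1 - oy) // half
        if qx0 != qx1 or qy0 != qy1:
            break  # the example straddles a quadrant boundary
        path.append(2 * qy0 + qx0)
        ox += qx0 * half
        oy += qy0 * half
        size = half
    return tuple(path)


class DiscriminationNet:
    """Examples of one spatial object, indexed by minimum containing block."""

    def __init__(self, size):
        self.size = size
        self.examples = {}  # path -> list of bounding boxes stored at that node

    def store(self, bbox):
        path = containing_block(bbox, self.size)
        self.examples.setdefault(path, []).append(bbox)

    def query(self, path):
        """All examples stored at or below the node identified by `path`."""
        path = tuple(path)
        n = len(path)
        return [b for p, boxes in self.examples.items() if p[:n] == path
                for b in boxes]
```

Note that an example stored at an ancestor of the query node is not returned by this sketch, since it is not contained in the query window.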

There are two ways of storing the object, either exact locations or rectangle approximation. 
The rectangle approximation of an object can be represented by specifying its area, eccentricity, 
centroid and orientation. 

11.2. Inductive Learning 

Inductive learning is used to provide a new definition of an object from a given set of exam- 
ples so that search for the object can proceed more efficiently. 

To learn a new definition of an object, either the user can give input definitions or the system 
itself can generate new definitions. Since it is not possible to include all possible PPROPS, 
RPROPS and GPROPS to give definitions, the user specifies the appropriate values and the system 
generates the definitions using those properties. 

The inductive learning submodule of KBGIS-II is based on INDUCE[9]. INDUCE is a general 
purpose inductive learning program that takes a set of input rules and generates one or more 
output rules which are simpler, more general and consistent with the input rules. Given a set of 
input rules, it first finds a set of alternative consistent generalizations by locating the most 
promising clauses (which are common) and adding new clauses to each of them until a set of con- 
sistent generalizations of the event is obtained. After getting the cover, it extends the response of 
the functions and then selects the best generalization from this set and removes the rules for 
which this is a generalization. The criteria for selecting the best rules as well as the number of 
the output rules can be changed by varying parameters. 

INDUCE has the facility of providing background knowledge and the user can add arith- 
metic and logic rules for generalization. Besides background rules INDUCE also has the capabil- 
ity of adding new functions, equivalence predicates and extremity predicates. 

The language of INDUCE is different from that of KBGIS-II. Hence translators are used 
to convert an object definition from one language to the other. First we discuss the KBGIS-II - 
INDUCE translator and then the INDUCE - KBGIS-II translator. 

11.2.1. KBGIS-II - INDUCE 

This translator takes a set of rules in KBGIS-II language and converts it into INDUCE for- 
mat. Besides the syntax transformations, it performs the following tasks: 

a) The current implementation of INDUCE does not allow disjunctions, therefore if an input 
rule has a disjunction, it splits the rule into two rules, e.g. 

(x1 V x2) /\ y => [d=1] 

becomes 

x1 /\ y => [d=1] 

x2 /\ y => [d=1] 


In KBGIS-II GPROPS and RPROPS do not have disjunctions but PPROPs can have a 
clause which has disjunction of two layers. Hence, for each disjunction it generates a new 
rule, i.e. for the following input rule 

((TYPE 0_1) LAKE) 

/\ (((LAND 0_2) 21) V ((GEOL 0_2) 22)) 

/\(((ELEV 0_2) (100 200)) V ((SLOPE 0_2) (20 40))) 

the output will be 

[TYPE (0_1) = LAKE] [LAND (0_2) = 21] [ELEV (0_2) = 100..200] => [d=1] 

[TYPE (0_1) = LAKE] [GEOL (0_2) = 22] [ELEV (0_2) = 100..200] => [d=1] 

[TYPE (0_1) = LAKE] [LAND (0_2) = 21] [SLOPE (0_2) = 20..40] => [d=1] 

[TYPE (0_1) = LAKE] [GEOL (0_2) = 22] [SLOPE (0_2) = 20..40] => [d=1] 

b) INDUCE cannot handle real numbers and also the range of values should not be very large 
(typically < 100) therefore values of GPROPS and RPROPS are properly normalized. 
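
Task (a) amounts to taking one alternative from each clause in every possible combination. The following sketch shows this with a Cartesian product; the data layout and the output syntax are assumptions that only approximate the INDUCE format shown above, and the normalization of task (b) is not included.

```python
from itertools import product


def split_disjunctions(cnf):
    """Expand a conjunction of clauses (each a list of alternative literals)
    into one purely conjunctive rule per combination of alternatives."""
    return [list(choice) for choice in product(*cnf)]


def to_induce(rule, decision="d=1"):
    """Render one conjunctive rule in an INDUCE-like bracket syntax (approximate).

    Each literal is (function, argument, value); a tuple value is a range.
    """
    parts = []
    for func, arg, value in rule:
        if isinstance(value, tuple):
            value = f"{value[0]}..{value[1]}"
        parts.append(f"[{func} ({arg}) = {value}]")
    return " ".join(parts) + f" => [{decision}]"


# The LAKE rule from the text: one single-literal clause and two disjunctions.
LAKE_RULE = [
    [("TYPE", "0_1", "LAKE")],
    [("LAND", "0_2", 21), ("GEOL", "0_2", 22)],
    [("ELEV", "0_2", (100, 200)), ("SLOPE", "0_2", (20, 40))],
]
```

Applying `split_disjunctions` to `LAKE_RULE` yields four conjunctive rules, one for each pairing of a LAND or GEOL literal with an ELEV or SLOPE literal.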

11.2.2. INDUCE-KBGIS 

This translator takes a set of rules in INDUCE language and converts it into KBGIS 
language. In the current implementation the language of KBGIS is not fully compatible with the 
language of INDUCE, therefore if there is any input clause that cannot be converted into KBGIS 
language it is ignored. 

If the system has learnt the definition of a new object, it is directly stored in the Spatial 
Object Knowledge Base. Otherwise, the system compares the new definition with the old one and 
if it is better the Spatial Object Knowledge Base is modified. In the current implementation due 
to language incompatibilities, sometimes the system may not be able to handle the INDUCE out- 
put. In such cases the user may interpret the output and update the Knowledge Base, using the 
Knowledge Base Editor. 


12. SUMMARY AND CONCLUSIONS 


KBGIS-II, as described in this essay, is currently implemented and running on a VAX- 
11/750 under the VMS operating system at the University of California, Santa Barbara. The sys- 
tem is programmed in Common Lisp, Pascal and C. We now briefly summarize the degree to 
which KBGIS-II meets the four requirements, laid out above, and the manner in which the four 
general principles, also listed above, are used to meet these requirements. It should be 
emphasized that the properties of the currently-implemented system are still under investigation 
and that there are plans to continue development of KBGIS-II. We therefore discuss both current 
research that is being performed using the current system and planned extensions to the system. 

12.1. Requirements and Principles. 

The system is currently capable of handling large, multilayered, heterogeneous, spatially- 
indexed databases. The software design entailed by the overall system requirements described 
below has of necessity made the current hardware (VAX-11/750) a limiting factor in the size of 
the databases that can be handled at interactive speeds. The transfer of the system to more 
appropriate hardware (such as a LISP machine) would resolve much of this problem. The system 
has the capability of responding to all the queries of types (1) and (2) that are expressible in our 
spatial object language (SOL). Although research is still in progress on the matter, the processing 
of the queries appears to be relatively efficient in the sense of reducing the average complexity of 
the search for spatial objects. The hardware deficiencies, however, do not permit the system to be 
truly interactive in the case of queries concerning complex spatial objects. KBGIS-II is flexible 
with respect to both domains of application and users. 

Concerning the role of the four sets of principles in allowing the system to satisfy these four 
requirements, we make the following comments: 

a) The development of the system suffered from a failure to adhere to the principled use of the 
techniques of software engineering, although it benefited from the systematic application of 
techniques from database management (in the construction and storage of the location tree 
database, where the spatial image data is segmented into receivable areas that are paged in 
on demand, see Klinger[12]); from the use of the theory of algorithms and complexity (in the 
construction of spatial search procedures); from the use of AI techniques (in the structuring 
of the knowledge base, in the design of the spatial search procedures and in the application 
of the learning procedures); and from the application of computer graphics techniques (in 
terms of the system output). 

b) The integration of techniques from computer vision and image processing provides the 
system with an ability to handle queries of a type not typically found in GIS, while allowing 
the system to integrate both image and digital cartographic data. 

c) The six principles discussed in the general section on search greatly reduce the computational 
effort of the system in responding to queries, as compared with standard, exhaustive 
raster-based search procedures. 

d) Finally, the availability of various editors allows the system to be easily tailored for use in 
various spatial domains and for various users. 

12.2. Investigation of System Performance 

Investigations of the system’s ability to handle various queries concerning a large geological 
database are currently underway, and will be reported in future publications. The main research 
effort involves an empirical analysis of the efficiency of the search procedures, and of the effects of 
varying various parameters that affect the efficiency of search. 

12.3. Extensions to the system 

Planning is currently underway concerning modifications to KBGIS-II that will both 
improve the efficiency of its current processing capabilities and extend its current capabilities. 
The planned extensions include: 

a) Adding computer cartographic capabilities and ordinary polygon processing functions similar 
to those found in currently available systems such as ARC/INFO. 


b) Adding "fuzzy" spatial object definitions and "fuzzy" reasoning. 

c) Adding a database and specialized processing functions for remotely-sensed data and an 
interface between this database and KBGIS-II; adding procedures for map-guided image 
interpretation; and providing data structures and procedures that permit joint querying of 
both the digitized cartographic and image databases. 

d) Providing the system with the capability of answering a class of queries that involve detec- 
tion of change over time. 

e) Providing procedures and control structures that permit the inductive learning procedures of 
the system to operate autonomously. 
Figure 2. The Spatial Object Data Structure 

SLOT-NAME      CONTENTS 

FeatureType    Distinguishes between linear and region based features. 

DefinedBy      The definition of the object, a disjunctive normal form 
               expression in the spatial object language. 

Defines        A list of hierarchically higher objects that are defined 
               in terms of the object. 

Heuristics     Other objects whose locations are contextually related to 
               the locations of the object, together with the nature of 
               the spatial relation involved. 

Complexity     A measure of the complexity of search for new examples of 
               the object. 

Size           An approximation to the linear and areal dimension of the 
               object. 

Procedures     Pointers to low level algorithms that can operate directly 
               on the image without recourse to the definition of the 
               object. 

















Figure 3. The Function Data Structure 

SLOT-NAME      CONTENTS 

Propagation    Indicates, in the case of GPROPS and RPROPS, if the 
               function can be inverted to propagate constraints during 
               search. 

Complexity     A measure of the computational complexity of the function, 
               used in the calculation of the complexity of spatial 
               objects defined using the corresponding GPROP or RPROP. 

Symmetry       Information on the symmetric properties of binary (RPROP) 
               functions, permitting arguments to be switched by the 
               search control, depending on dynamic object prioritization. 

Range          The nature of the values that the user may specify as 
               desired when the corresponding GPROP or RPROP is used to 
               define an object. 

Subroutines    The names of subroutines called by the function, used in 
               system management. 
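
For readers who find a typed record easier to scan, the slots of Figures 2 and 3 can be written as two record types. This is an illustrative sketch only: the field types, default values, and the added `name` fields are assumptions, and the system itself stores these structures in its knowledge base rather than in this form.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class SpatialObject:
    """Slots of Figure 2 (types are assumed for illustration)."""
    name: str                       # hypothetical identifier, not a listed slot
    feature_type: str               # "linear" or "region"
    defined_by: Any                 # expression in the spatial object language
    defines: List[str] = field(default_factory=list)   # higher level objects
    heuristics: List[Tuple[str, str]] = field(default_factory=list)  # (object, relation)
    complexity: float = 0.0         # cost of searching for new examples
    size: Optional[float] = None    # approximate linear and areal dimension
    procedures: List[Callable] = field(default_factory=list)  # special search routines


@dataclass
class FunctionRecord:
    """Slots of Figure 3 (types are assumed for illustration)."""
    name: str                       # hypothetical identifier, not a listed slot
    propagation: bool               # can the function be inverted to propagate constraints?
    complexity: float
    symmetry: bool                  # may the arguments of a binary function be switched?
    value_range: Any                # nature of the values the user may specify
    subroutines: List[str] = field(default_factory=list)
```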
NASA Pilot Land Data System 

Dr. Jeffrey L. Star 
Prof. John E. Estes 

The Pilot Land Data System (PLDS), funded by NASA's 
Information Systems Office, is a multi-institution effort 
directed towards solving the data access and management needs of 
scientists studying the land surface and working with NASA (NASA 
1984). Researchers at the University of California, Santa 
Barbara have been involved in PLDS since the first planning 
workshops several years ago, and have contributed in several 
areas (Estes et al 1986). 

Workstations and PLDS 

The purpose of one small effort, in conjunction with Mr. 
William Likens at NASA Ames Research Laboratory, has been to 
continue PLDS development activity in the area of scientific 
workstations. This was funded in part through NASA grant NAG2-362, 
"Workstations and the NASA Pilot Land Data System". Within 
PLDS, workstations are viewed as the user interface to the 
network, providing local processing capabilities as well as remote 
access to communications, processing power, and data. 

The staff at NASA Ames Research Center has developed a 
Workstation Subsystem Databook (Likens et al, 1985). This volume 
reviews some of the hardware and software which is now available 
from the commercial sector. A portion of our efforts under this 
contract has been to help gather data for the databook, as well 
as to ensure the technical accuracy of the book. 



We have also participated in technical meetings sponsored by Mr. 
Likens at the Jet Propulsion Laboratory. The meeting in Pasadena on 7 
March 85 considered workstations to be used in the funded PLDS science 
projects, and we discussed methods to use the expertise developed 
throughout the working group to assist the successful completion of the 
science scenarios. In particular, we met with Dr. Alex Goetz to 
discuss his needs for manipulating spectrometer datasets, with Mr. Mike 
Martin to examine the overlap between his efforts on the Pilot 
Planetary Data System workstations and ours on the PLDS, and with Mr. 
Fred Billingsley to discuss the DAVID catalog interchange system. 

Recent efforts on workstation developments have included 
coordination with Mr. Billingsley and Mr. Paylor from JPL. We have 
helped guide them to a set of specifications for the workstations they 
will be purchasing, and demonstrated several workstations and software 
systems which we own and operate to better acquaint them with commercial 
options. 

On 18 December 85, Mr. Likens came to UCSB for a final meeting. 
We demonstrated desktop image processing systems we have purchased or 
developed, and provided additional information for the databook. 

PLDS Science Steering Group 

On 16 Sept 85, and on 15-16 April 86, we participated in the 
PLDS Science Steering Group (SSG) meetings at Goddard Space Flight 
Center. At these meetings, we both reviewed progress to date 
towards meeting PLDS objectives in supporting the science 
scenarios, and formulated plans for demonstration projects 
in the near future. At the April PLDS SSG meeting Dr. Estes, in 
Dr. Rossow's absence, acted as chair of the SSG. A significant 
part of the work at these meetings was to consider alternatives 
in the face of significantly less funding than anticipated. This 
included prioritizing portions of the program, as well as 
examining alternatives that build on efforts funded elsewhere. 

The draft of Dr. Estes' comments is included below. 
SUMMARY 


John E. Estes and Jeffrey L. Star 

This document represents a final report of work conducted 
under grant NASA NAGW-455 during the period May 1, 1985 to 
April, 1986. This document includes material on research 
undertaken and some indication of the directions we propose to take 
in the coming year of this fundamental and applied research 
effort. 

The Information System Research Group research continues to 
focus on improving the type, quantity, and quality of information 
which can be derived from remotely sensed data. As we move into 
the coming year of our research, we will continue to focus on 
information science research issues. In particular, we will 
focus on the needs of the remote sensing research and application 
community which will be served by the Earth Observing System 
(EOS) and Space Station. Research conducted under this grant has 
been used to extend and expand existing remote sensing research 
activities at UCSB in the areas of georeferenced information 
systems, machine assisted information extraction from image data, 
artificial intelligence, and vegetation analysis and modeling. 

The program of research, documented in this progress report, 
is being carried forward by personnel of the University of 
California, Santa Barbara. The report documents our 
accomplishments in what we consider to be a five to ten year 
effort to prepare to take full advantage of the 
capabilities of the platforms and systems associated with Space 
Station (e.g., EOS). Through this work, we have targeted 
fundamental research aimed at improving our basic understanding 
of the role of information systems technologies and artificial 
intelligence techniques in the integration, manipulation and 
analysis of remotely sensed data for global scale studies. This 
coordinated research program is possible at UCSB due to a unique 
combination of researchers with experience in all these areas. 

Several of our projects have used this grant as a catalyst 
to aid other NASA offices in research on the integration of 
remotely sensed and other data into an information sciences 
framework. During this year we have received funds from NASA 
Code E/I to supplement ISRG activities. These funds were 
proposed in September, 1985 (and received in late March, 1986) to 
cover a range of tasks. In addition, we have received partial 
funding from the World Bank to aid in their image processing and 
information systems activities. 

The research currently being performed under this grant is 
significant. The committees in which grant personnel are 
involved are also important. As we move into the Space Station 
era, we must constantly be aware that sensor systems being 
proposed for the Space Station Complex are, by and large, 
information systems. For them to achieve their full 
interdisciplinary potential, a great deal of fundamental and 
applied research is needed. This grant is facilitating this 


Appendices 



APPENDIX 1 


Methodologies of Mapping and Accuracy Determination 
for Regional Assessments of Natural Vegetation 


(Please see January 1, 1986 Progress Report for 
complete report). 


Large Area Vegetation Mapping 

Methodologies of Mapping and Accuracy Determination 
for Regional Assessments of Natural Vegetation 


John E. Estes 
Michael J. Cosentino 

The multi-resolution capabilities of the proposed Earth 
Observing Satellite (EOS) require that a conceptual framework 
for large area mapping needs be established in order to 
realize the information potential of such a system. Within such 
a conceptual framework, techniques, procedures, and methods need 
to be developed: 1) to rapidly map large portions of the earth's 
surface; and 2) to quantify map accuracy and confidence 
intervals. Towards this end, we have produced several large area 
vegetation maps derived from satellite imagery and verified their 
accuracy by comparing mapped classes at selected sample points 
with direct observations (of actual vegetation) from the ground 
and low altitude aircraft. 

We tested the usefulness of Landsat data in constructing 
accurate vegetation maps by developing maps for two study sites, 
Mt. Washington, N.H., and the Superior National Forest, MN. The 
results of these studies are discussed in detail in Appendices 
III and IV. Multidate Landsat scenes were used to manually 
identify and map vegetation cover classes in order to determine 
the spatial extent of vegetation patterns in the forest. The 
methodology developed for producing the vegetation maps involved 
two basic steps: classification and mapping of the Landsat 
images, followed by subsequent accuracy verification. Two 
Landsat scenes (winter and summer) were acquired for each area. 
A Knowledge Based System for the Classification 
of Agricultural Lands 


(Please see January 1, 1986 Progress Report 
for complete report). 



A KNOWLEDGE-BASED SYSTEM 


FOR THE 

CLASSIFICATION OF AGRICULTURAL LANDS 

by 

Charlene T. Sailer and John E. Estes 

One of the main interests of the Remote Sensing Information 
Sciences Research Group, University of California, Santa Barbara, 
has been the use of machine-assisted processing of satellite data. 
As discussed in previous progress reports, there are a number of 
areas in which the field of artificial intelligence may be applied 
to the analysis problem. The rest of this chapter outlines a 
development effort in which we are attempting to build an 
expert system to classify Landsat data for agricultural lands. 

Such an expert system could significantly reduce a human analyst's 
time and cost in this specific problem domain. We believe this 
kind of approach has generality, and will be particularly useful 
in the years ahead when systems such as EOS will be able to 
provide orders of magnitude more data for us to use. 

The objective of this research is to demonstrate the 
feasibility of incorporating reasoning into the computer-assisted 
classification of digital images. The model being developed will 
be structured as a rule-based production system which will 
simulate interaction with a digital image processing package. The 
system will focus on the digital classification of agricultural 


Evaluation of Thematic Mapper Simulator Data 
for Commercialization and Time Dynamics 


(Please see January 1, 1986 Progress Report 
for complete report). 


Evaluation of Thematic Mapper Simulator Data 
for Commercialization and Time Dynamics 


by 

L.J. Mann, C.T. Sailer, and J.E. Estes 

INTRODUCTION 

This section of the report summarizes research evaluating Thematic Mapper Simulator 
(TMS) data to improve understanding of their potential commercial agricultural applications. 
The thrust of the research was to examine processing techniques and the improved information 
potential available from multispectral data acquired with high temporal frequency. Such work will be 
valuable in analyzing the overall high resolution imaging system which is planned as part of the 
Earth Observing System (EOS) sensor complement (e.g., the High Resolution Imaging Spectrometer 
(HIRS)). 

Our approach in this research was to examine the TMS data for two diverse and highly 
productive agricultural areas in southern and central California. Supervised and unsupervised 
classifications were performed and evaluated in an effort to monitor change through time in the 
agricultural crops of the study sites. The information potential inherent in the temporal dimension 
of TMS data was addressed in this study to examine agricultural management issues which arise 
during farm operations. The commercialization aspect of the study was addressed by identifying 
the potential market for data of this type, the market’s data frequency requirements, and 
anticipated product needs. Product need was examined from both a hardware and a software standpoint, 
with emphasis placed on remote sensing data products. 

TMS data used in this evaluation were simulated by a Daedalus 074 high-altitude 
multispectral scanner flown on U-2 and ER-2 aircraft at an approximate altitude of 65,000 ft. above 
ground datum. The Daedalus system has an IFOV of 1.3 mr and a ground resolution of 28 m. 
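
The relation between IFOV, altitude, and ground resolution can be checked with a short calculation. The sketch below is illustrative only and assumes a nadir view over flat terrain and the small-angle approximation (footprint = altitude x IFOV in radians); the function and constant names are hypothetical. With the nominal figures quoted above it gives a footprint of roughly 26 m, the same order as the stated 28 m. The difference may reflect the approximate altitude or rounding in the quoted values, which the source does not specify.

```python
FEET_TO_METERS = 0.3048  # exact conversion factor


def ground_resolution_m(ifov_mrad, altitude_ft):
    """Nadir ground footprint (m) of one IFOV, small-angle approximation."""
    altitude_m = altitude_ft * FEET_TO_METERS
    # Convert milliradians to radians, then multiply by the altitude.
    return altitude_m * ifov_mrad * 1e-3


# Nominal values from the text: IFOV of 1.3 mr at about 65,000 ft.
footprint = ground_resolution_m(1.3, 65_000)
print(f"{footprint:.1f} m")  # prints "25.8 m"
```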
EVALUATION OF THEMATIC MAPPER SIMULATOR DATA FOR COMMERCIAL 
APPLICATION TO NATURAL VEGETATION OF SOUTHERN CALIFORNIA 

by 

Michael Cosentino and John Estes 


This report briefly summarizes an evaluation of Thematic 
Mapper Simulator (TMS) data for commercial application to 
natural vegetation of southern California. The approach was to 
examine the TMS data for several dates (7/2, 8/6, 9/17) over 
the same area and to determine whether phenological changes in 
natural vegetation could be detected. Species discrimination 
within the chaparral brush stands of southern California has 
typically been extremely difficult using 80m resolution MSS 
data: stands of one type of green shrub are very difficult to 
discriminate from stands of another at that resolution. The 
increased spatial and spectral resolution 
of TMS data, coupled with knowledge of phenological cycles of 
natural vegetation, could provide new and valuable data 
concerning the spatial distribution of key vegetation species. 

The temporal sequence of TMS data for this study was 
acquired during a period (July - September) when the flowering 
heads of the chamise plant (Adenostoma fasciculatum) began to 
dry and harden and turn a distinctive red/brown. Chamise is an 
important chaparral species with a broad range over all of 
California. Spectral discrimination of the spatial 
distribution of stands of chamise would be highly desirable for 
Geographic Information Systems for Scholars 


(Please see January 1, 1986 Progress Report 
for complete report). 



Geographic Information System for Scholars 

Prof. John E. Estes 
Dr. Jeffrey L. Star 

The Research Libraries Group Inc. (RLG) convened a meeting 
in January, 1986, at the University of California, Santa Barbara, 
to discuss the problems of managing geographic information. RLG 
is a corporation of 35 major research universities, which is 
owned by the member universities. RLG has experience in the 
design and operation of information systems and networked 
communications, which has been directed at problems of access to 
research materials for scholars. 

RLG has been active in the development and implementation of 
standard descriptive formats for information about a range of 
research materials, as well as systems for the retrieval of 
information. In addition, RLG has an outstanding record in 
obtaining extramural funds for designated development projects, 
largely from private not-for-profit institutions. 

The meeting focused on the growing need within the research 
library community to provide scholarly access to geographic or 
spatially distributed data, including satellite digital data, 
photography, and map data in both digital and analog formats. 
Important collections of such material are found at many 
universities which are part of RLG, as well as federal and state 
organizations. Representatives from NASA (M. Devirian), NOAA (J. 


APPENDIX 6 

Committee Memberships 



APPENDIX 6 — Committee Memberships 


NASA Pilot Land Data System 

Science Steering Group: John E. Estes 

Technology Working Group: Jeffrey L. Star 

NASA Earth Observing System Data Panel 
John E. Estes, Jeffrey L. Star 

National Academy of Sciences 

Committee on Data Management and Computation: John E. Estes 

National Bureau of Standards 

Initial Graphics Exchange Standard: Jeffrey L. Star 

NASA Data Interchange Formats 
Jeffrey L. Star 

Geocarto International 

International Editorial Board: John E. Estes 

International Conference on Advanced Technology for Monitoring 
and Processing Global Information 

1985 Workshop Leader and Session Chairman: Terence R. Smith 

Research Libraries Group 

Task Force on Geographic Information: Jeffrey L. Star 
APPENDIX 7 — Presentations and Symposia 


J.E. Estes. Improved Information Systems in Support of Global 
Science. AIAA/NASA Earth Observing Systems Conference, October 
1985, Virginia Beach. 

J.L. Star. Recent Advances in Large Area Vegetation Mapping. 
Man's Impact on the Global Environment. October 1985, Venice, 
Italy. 

Terence R. Smith. Invited Lecturer, Workshop on Brain Research, 
Cognition, and Artificial Intelligence, Abisko, Sweden. 

Terence R. Smith. Invited Lecturer, Jacob Marschak 
Interdisciplinary Colloquium on Mathematics in the Behavioural 
Sciences, UCLA. 

Terence R. Smith. Invited Lecturer, Workshop on Modelling 
Complex Systems. I. Prigogine Center for Studies in Statistical 
Mechanics, Univ. Texas, Austin. 